GENERAL  PURPOSE  AND  SCOPE.  The  Applied  Computational  Electromagnetics  Society  Journal 
hereinafter  known  as  the  ACES  Journal  is  devoted  to  the  exchange  of  Information  In  computational 
electromagnetics,  to  the  advancement  of  the  state-of-the-art,  and  to  the  promotion  of  related  technical 
activities.  A  primary  objective  of  the  information  exchange  is  the  elimination  of  the  need  to  “re-invent 
the  wheel"  to  solve  a  previously-solved  computational  problem  in  electrical  engineering,  physics,  or 
related  fields  of  study.  The  technical  activities  promoted  by  this  publication  include  code  validation, 
performance  analysis,  and  input/ output  standardization;  code  or  technique  optimization  and  error 
minimization;  innovations  in  solution  technique  or  in  data  input/output;  identification  of  new  applica¬ 
tions  for  electromagnetics  modeling  codes  and  techniques;  integration  of  computational  electromagnet¬ 
ics  techniques  with  new  computer  architectures;  and  correlation  of  computational  parameters  with 
physical  mechanisms. 

SUBMISSIONS  CONTENT.  The  ACES  Journal  welcomes  original,  previously  unpublished  papers, 
relating  to  applied  computational  electromagnetics . 

Typical  papers  will  represent  the  computational  electromagnetics  aspects  of  research  in  electrical 
engineering,  physics,  or  related  disciplines.  However,  papers  which  represent  research  in  applied 
computational  electromagnetics  itself  are  equally  acceptable. 

For  additional  details,  see  “Information  for  Authors”,  elsewhere  in  this  issue. 

SUBSCRIPTIONS.  All  members  of  the  Applied  Computational  Electromagnetics  Society  (ACES)  who  have 
paid  their  subscription  fees  are  entitled  to  receive  the  ACES  Journal  with  a  minimum  of  two  Issues  per 
calendar  year.  Current  annual  subscription  fees  are: 

AREAS:  U.S.  and  Canada:  AIRMAIL:  $55  Individual,  $105  Organizational,  SURFACE  MAIL  N/A; 
Mexico,  Central  &  So.  America:  AIRMAIL:  $60  Individual,  $115  Organizational,  SURFACE  MAIL:  $58; 
Europe,  Former  USSR  Turkey,  Scandinavia,  AIRMAIL:  $68  Individual,  $1 15  Organizational,  SURFACE 
MAIL:  $58;  Asia  Africa  Middle  East  and  Pacific  Rim:  AIRMAIL:  $75  Individual,  $115  Organizational, 
SURFACE  MAIL:  $58.  FULL-TIME  STUDENTS:  $25.00.  REMIT  BY:  (1)  BANK  DRAFTS  (MUST  BE  DRAWN 
ON  U.S.  BANK),  (2)  INTERNATIONAL  MONEY  ORDER,  (3)  TRAVELER  S  CHECKS  IN  U.S.  DOLLARS, 
(4)  ELECTRONIC  TRANSFER.  (CONTACT  ACES  SECRETARY). 

Back  issues,  when  available,  are  $15.00  each.  The  Special  Issue  on  Canonical  Problems  is  $15.00. 
Subscriptions  to  ACES,  orders  for  back  issues  of  the  ACES  Journal  and  changes  of  addresses  should  be 
sent  to: 

Dr.  Richard  Adler 
ACES  Secretary 
Naval  Postgraduate  School 
Code  EC/AB 
Monterey.  CA  93943  USA 
Fax:  (408)  649-0300 

Allow  four  week's  advance  notice  for  change  of  address.  Claims  for  missing  issues  will  not  be  honored 
because  of  Insufficient  notice  of  address  change  or  loss  in  mail  unless  the  secretary  is  notified  within  60 
days  for  USA  and  Canadian  subscribers  or  90  days  for  subscribers  in  other  countries,  from  the  last  day 
of  the  month  of  publication.  For  information  regarding  reprints  of  individual  papers  or  other  materials, 
see  “Information  for  Authors". 

LIABILITY.  Neither  ACES  or  the  ACES  Journal  editors  are  responsible  for  any  consequence  of 
misinformation  or  claims,  express  or  implied,  in  any  published  material  in  an  ACES  Journal  issue.  This 
also  applies  to  advertising,  for  which  only  camera-ready  copies  are  accepted.  Authors  are  responsible 
for  Information  contained  in  their  papers.  If  any  material  submitted  for  publication  includes  material 
which  has  already  been  published  elsewhere,  it  is  the  author's  responsibility  to  obtain  written  permission 
to  reproduce  such  material. 


Summer  1992 
Vol.  7  No.  1 

ISSN  1054-4 887 


The  ACES  Journal  la  abstracted  In  INSPEC,  In  Engineering  Index,  and  in  DTIC. 

The  first,  third,  fourth,  and  fifth  illustrations  on  the  front  cover  have  been  obtained  from  Lawrence 
Livermore  National  Laboratory. 

The  second  illustration  on  the  front  cover  has  been  obtained  from  FLUX2D  software  — CEDRAT  S.A. 
France  --  MAGSOFT  Corporation,  New  York. 


1 


CD 


THE  APPLIED  COMPUTATIONAL  ELECTROMAGNETICS 


SOCIETY  JOURNAL 

EDITORS 


MANAGING  EDITOR 
Richard  W.  Adler 
Naval  Postgraduate  School 
Code  EC/AB 

Monterey,  CA  93943,  U.S.A. 


EDITOR-IN-CHIEF.  EMERITUS 
Robert  M.  Bevensee,  U.S.A. 


EDITOR-IN-CHIEF 
David  E.  Stein 

Headquarters  Air  Force  Systems  Command 
P.O.  Box  169 

Unthlcum  Heights,  MD  21090  U.S.A. 


Ruediger  Anders  Tony  Fleming 

Applied  Electromagnetics  Engineering  Telecom  Australia 

Mahwah.  NJ  07430  Clayton.  Victoria,  AUSTRALIA 


Kenzo  Mlya 
University  of  Tokyo 
Tokyo,  JAPAN 


Harold  W.  Asklns 
The  Citadel 
Charleston.  SC,  USA 


Pat  Foster 

Microwave  &  Antenna  Systems 
Worcestershire,  UK 


Giorgio  Mollnarl 
University  of  Genova 
Genova,  ITALY 


Brian  A.  Austin 
University  of  Liverpool 
Liverpool.  UK 

Duncan  C.  Baker 
University  of  Pretoria 
Pretoria.  SOUTH  AFRICA 

Fuhlo  Bessl 

Ingegnerla  del  Slstemt  S.p.A. 
Pisa.  ITALY 

Robert  M.  Bevensee 
Consultant 
Alamo.  CA,  USA 

John  R.  Bowler 
University  of  Surrey 
Surrey,  UK 

James  K.  Breaks  11 
Pennsylvania  State  University 
University  Park.  PA.  USA 


Gregory  R.  Haack 
DSTO 

Salisbury.  AUSTRALIA 

Christian  Hafner 

Swiss  Federal  Institute  of  Te 

Zurich.  SWITZERLAND 

Roger  Harrington 
Syracuse  University 
Syracuse,  NY,  USA 

Kiyohlko  Itoh 
Hokkaido  University 
Sapporo,  JAPAN 

Adalbert  Konrad 
University  of  Toronto 
Toronto,  Ontario,  CANADA 

Stanley  Kublna 
Concordia  University 
Montreal,  Quebec,  CANADA 


Frederic  A.  Molinet 

Soclete  Motheslm 

Le  Plessls-  Robinson.  FRANCE 

Gerrlt  Mur 

Technlsche  Unlversltelt  Delft 
Delft.  NETHERLANDS 

Takayoshl  Nakata 
Okayama  University 
Okayama,  JAPAN 

Andrew  F.  Peterson 
GA  Institute  of  Technology 
Atlanta.  GA.  USA 

Harold  A.  Sabbagh 
Sabbagh  Associates  Inc. 
Bloomington,  IN,  USA 

Ted  L.  Simpson 
University  of  S.  Carolina 
Columbia,  SC,  USA 


Robert  T.  Brown 

Lockheed  Aeronautical  Sys.  Co. 

Valencia,  CA.  USA 


Karl  J.  Langenberg 
Unlversitat  Kassel 
Kassel,  GERMANY 


Chris  Smith 

Kaman  Sciences  Corp. 

Colorado  Springs,  CO,  USA 


Chalmers  M.  Butler 
Clemson  University 
Clemson,  SC,  USA 

Dawson  Coblln 

Lockheed  Missiles  &  Space  Co. 
Sunnyvale,  CA.  USA 

Edgar  Coffey 

Advanced  Electromagnetics 
Albuquerque.  NM,  USA 

Peter  S.  Excel! 

University  of  Bradford 
West  Yorkshire,  UK 


Ray  Luebbers 

Pennsylvania  State  University 
University  Park,  PA,  USA 

Andrew  L.  Maffett 
Consultant 
Dexter,  MI.  USA 

Ronald  Marhefka 
Ohio  State  University 
Columbus,  OH,  USA 

Edmund  K.  Mlllier 
Los  Alamos  National  Lab. 

Los  Alamos,  NM,  USA 


C.  W.  "Bill"  Trowbridge 
Vector  Fields  Limited 
Oxford.  UK 

Jean-Claude  Verlte 
Electrlclte  de  France 
Clamart  Cedex,  FRANCE 

John  W.  Williams 
Science  Applications  Int. 
Albuquerque,  NM,  USA 

Manfred  Wurm 
Industrleanlagen- 
Betrfebsgesellschaft  mbll 
Ottobrunn.  GERMANY 


2 


THE  APPLIED  COMPUTATIONAL  ELECTROMAGNETICS 


SOCIETY  JOURNAL 


Vol.  7  No.  1 


Summer  1992 


TABLE  OF  CONTENTS 


"Electromagnetic  Scattering  From  Radially  or  Axially  Inhomogeneous  Objects" 

by  A.A.  Klshk  and  M.  Abouzahra . 4 

"Application  of  Equivalent  Edge  Currents  to  Correct  the  Backscattered  Physical  Optics 
Field  of  Flat  Plates" 

by  V.  Stein . 24 

"Application  of  Parallel  Processing  To  a  Surface  Patch/Wire  Junction  EFIE  Code" 

by  L.C.  Russell  and  J.W.  Rockway . 48 

"A  Recursive  Technique  to  Avoid  Arithmetic  Overflow  and  Underflow  When  Computing 
Slowly  Convergent  Eigenfunction  Type  Expansions" 

by  G.A.  Somers  and  B.  A.  Munk . 67 

"H-Oriented'  and  'B-Oriented'  Methods  In  a  Problem  of  Nonlinear  Magnetostatics:  Some 
Methodological  Remarks" 

by  A.  Bossavit . 78 

Institutional  Members . 90 


©  1992,  The  Applied  Computational  Electromagnetics  Society 


3 


ELECTROMAGNETIC  SCATTERING  FROM  RADIALLY  OR  AXIALLY 

INHOMOGENEOUS  OBJECTS 
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ABSTRACT 

A  computer  program  based  on  the  method  of  moments  approach  is  developed  to  compute 
electromagnetic  scattering  from  axisymmetric  objects.  The  object  may  consist  of  N  linear 
isotropic  homogeneous  regions.  These  regions  may  be  arranged  axially  and/or  radially  with  the 
axis  of  symmetry.  Surface  integral  equations  (SIE)  formulation,  E-PMCHW,  is  used  to 
formulate  the  problem.  Other  formulations  can  easily  be  incorporated  in  the  computer  code. 
Bistatic  and  monostatic  Radar  Cross  Sections  (RCS)  for  several  benchmark  geometries  are 
computed.  The  computed  results  are  verified  by  comparison  with  measured  and  exact  calculated 
results.  In  some  cases  the  self-consistency  method  is  used  to  perform  the  verification.  The 
measured  and  calculated  data  presented  in  this  paper  are  expected  to  serve  as  benchmarks  for 
other  researchers  in  the  field.1 

I  INTRODUCTION 

The  study  of  electromagnetic  scattering  from  composite  materials  has  become  of  interest 
to  many  engineering  societies.  For  example,  in  biomedicine  modeling  of  human  bodies  and 
tissues  requires  very  complex  composite  objects.  With  the  relative  increase  in  the  complexity 
of  the  objects  of  interest,  numerical  solutions  become  necessary  to  study  the  electromagnetic 
characteristics  of  these  objects.  Various  numerical  methods  can  be  used  to  solve  such  problems 


1  This  work  was  sponsored  by  the  Department  of  the  Air  Force.  The  views  expressed  are  those  of  the 
authors  and  do  not  reflect  the  official  policy  or  position  of  the  US  Government. 


4 


[1-6],  however,  each  method  has  limitations. 

The  method  of  moments  has  been  proven  to  be  efficient  in  solving  Surface  Integral 
Equations  (SIE).  The  SIE  formulation  is  most  suitable  for  objects  made  of  linearly  isotropic 
homogeneous  materials.  For  inhomogeneous  objects,  other  formulations  may  be  used,  such  as 
the  Volume  Integral  Equations  (VIE)  or  Partial  Differential  Equations  (PDE)  formulations. 
However,  if  the  inhomogeneity  of  the  material  is  simple,  the  use  of  SIE  may  be  preferred, 
because  the  matrix  size  will  be  smaller  using  SIE  than  the  matrices  that  can  be  obtained  from 
other  formulations. 

In  this  paper,  the  SIE  formulation  is  used  to  assess  the  problem  of  electromagnetic 
scattering  from  bodies  of  revolution  made  of  multihomogeneous  regions.  The  present 
formulation  is  similar  to  the  one  given  in  [1].  The  method  of  moments  is  used  to  solve  the  SIE. 
The  resulting  matrix  system  is  a  buildup  of  the  basic  Z  and  Y  matrix  obtained  for  conducting 
or  dielectric  bodies  of  revolution  [7-8].  The  computer  program  that  has  been  developed  to 
compute  the  bistatic  or  monostatic  Radar  Cross  Sections  (RCS)  is  tested  and  verified,  and  several 
examples  are  selected  to  show  the  accuracy  of  this  program  in  predicting  RCS.  Only  one 
formulation  is  discussed;  however,  the  program  has  been  written  to  implement  easily  other 
surface  formulations,  such  as  those  reported  in  [9].  Also,  the  program  is  written  to  efficiently 
fill  the  matrix.  The  properties  of  symmetry  in  reference  to  the  impedance  and  admittance 
matrices  that  are  used  to  build  the  final  matrix  are  implemented.  All  the  numerical  results 
presented  in  this  paper  are  obtained  by  the  same  program  (MRBOR)  to  show  its  flexibility  and 
generality. 

II  DEVELOPMENT  OF  THE  SIE 

a.  Statement  of  the  Problem: 

In  this  section,  the  concept  of  the  equivalence  principle  is  used  to  derive  the  SIE 
formulation  for  composite  scatterers  with  N  homogeneous  regions.  The  geometry  and  notations 
for  such  a  scatterer  are  given  in  Fig.  1.  The  whole  space  is  divided  into  N-f  1  homogeneous 
regions  with  permittivities  e,  and  permeabilities  /q,  i  =  0,  1,  2,  ...,  N.  Lossy  materials  are 
considered  by  allowing  ep  nr  i  =  1,2,  ...,  N  to  be  complex.  Some  homogeneous  regions  are 
considered  to  be  perfect  conductors.  The  region  Vj  is  surrounded  by  a  closed  surface  S;  and 
recognized  by  the  inward  normal  unit  vector  n.  The  surface  interface  between  regions  V;  and 
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Vj  is  Sjj,  i  ^  j.  Thus,  Sj  is  the  set  of  all  interface  surfaces  S1J5  where  j  represents  all  region 
numbers  interfacing  with  region  Vr  Note  that  Sy  is  the  same  surface  as  S-;  however,  the 
normal  unit  vectors  rtj  and  nj  are  in  opposite  directions  to  each  other  on  Sy. 


V0  £0^0  Sq2 


b.SIE  Formulation: 

The  total  fields  in  each  homogeneous  region  are  denoted  by  E,  and  Hj,  i  =0,  1,2,..., 
N  for  the  electric  and  magnetic  fields,  respectively.  If  Vj  is  a  perfectly  conducting  region,  the 
fields  are  equal  to  zero.  In  the  free-space  region  V0,  the  total  fields  (Eg,  H0)  are  the  summation 
of  the  incident  and  scattered  fields  (Einc  +  E‘sc,  H,nc  +  Hsc).  From  Maxwell’s  equations  and 
the  equivalence  principle,  one  can  express  the  field  in  each  region  in  terms  of  unknown  electric- 
and  magnetic-equivalent  surface  currents.  In  this  papier,  the  sources  of  excitation  are  considered 
to  be  due  to  a  plane  wave  in  the  free  space  region;  therefore,  the  fields  at  any  observation  point 
r  in  the  free  space  can  be  expressed  as  [1]  (these  expressions  are  given  here  for  convenience) 

0(r)  Eo(r)  =  Emc  -  L^r')  -  K^Mod*')  (1) 

0(r)Ho(r)  =  H,nc-4Q  J0(r')  - ( l/^)  M0(r')  (2) 

and 
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(3) 


H r)  E,(r)  =  -L^.  Ji(r')+4i  Mj(r') 

0( r)  H,(r)  =  -k‘S]  J^r')  -  (l/rjf)L‘.  M^r')  (4) 


in  the  region  Vj,  where  i  =  1,  2,  ...  N.  Time  variation  of  e,wt  is  implied  and  suppressed 
throughout.  The  electric-  and  magnetic-surface  currents  along  the  boundaries  are 


Ji  =  nt  x  Hi 
Mj  =  -rij  x  Ei 


on  Si 


(5) 


In  these  equations, rjj  =  t)q  ■j^rU-n  ,  e,  =  e0  ein  =  MoMjr>  €ir  and  M,r  are  the  relative 

permittivity  and  permeability  in  the  region  Vj,  and  r?0  is  the  intrinsic  impedance  of  the  free 
space.  The  operators  L's  and  k's  are  defined  as 

L^c.cr')  =  javq  Js  [Ci(r')  +  (l/a>2elMj)  VV/.Ci(r/)]4>ldS/  (6) 

4;  C,  (r')  =  js  C,  (r')  x  V  *i  ds'  (7) 

where  CjCr^)  represents  the  currents  Jj  or  M,.  For  r  =  r/,  the  operators  are  interpreted  as 

Cauchy  principal-value  integrals.  4>j  is  the  Green’s  function  of  unbounded  region,  which  can 
be  represented  as 

4>,  (r  -  r')  =  e_jkilr  ‘  r/,/|r  -  r'|  (8) 


where  k,  is  the  wave  number  of  the  region  i,  which  is  equal  to  .  In  Equations  (1)  to 

(4)  the  value  of  0(r)  is  constant,  depending  on  the  position  of  r  as 


0(r) 


1  for  r  e  Vj 
-  1/2  for  r  e  Sj  • 
0  elsewhere 


(9) 
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Applying  the  boundary  conditions  on  each  S(J  yields  a  set  of  coupled  integral  equations 
for  the  unknown  electric  and  magnetic  currents  on  these  surfaces.  On  the  dielectric  interfaces, 
the  tangential  fields  are  continuous.  Thus, 

Eilun  -  EjU  on  S(j  (10) 

ni  x  Hj  =  n -  x  Hj  on  Sjj  (11) 

On  the  conductor  interfaces,  the  tangential  electric  fields  vanish,  yielding 

Ej  |tan  =  0  on  Sjj  (conductor  surfaces)  (12) 


To  enable  the  reader  to  have  better  understanding  of  Equations  (13)  to  (15)  (which  yield 
a  coupled  system  of  integral  equations),  a  specific  example  will  be  considered.  Consider  a 
scatterer  consisting  of  two  dielectric  regions  attached  to  a  perfectly  conducting  body,  all  in  free 
space,  as  shown  in  Fig.  2.  The  boundary  conditions  in  integral  forms  are 
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Fig.  2.  Example  of  a  three  regions  object. 


[Ls0  Jo  -  4  Mo  -  4  Jl  *  4,  M|  It.™  =  C  O"  So, 


"0  x  [«s0Jo  *  d^o)  ls0Mo1  "  n°  X  [KS,JI  *  LS,  M,j 


=  nQ  x  Hinc  on  S0i 


Jo  =  (Job  Jo2’  Jo3  ) 
Jl  =  ("Job  J 12’  J13) 
Mo  =  (Mob  M02) 
M]  =  (-M0I,  Mj2) 


(19) 


lLS0  Jo  -  40  M0  -  4,  h  +  4  M2  ]tan  =  C  on  S02 

Hq  x  [k°Sq  J0  +  (1/77^)  L^Mq]  -  n0  x  [^J2  +  (l/i||)  M2] 

=  nQ  x  Hinc  on  S02 

where 


J2  =  (~Jo2>  J 12*  J23  ) 

M2  =  (-M02'  -Mj2) 

(21) 

r,  0  .  0  ..  ,  ninc  ~ 

[LSo  Jo  Ks0  M0  lean  =  E'tan  0n  ^03 

(22) 

(Ls.  J, 

-  4,  mi  -  i-i  h  *  «i  m2  i,an 

-C  onS,2 

(23) 

n,  x  [*S) 

j,  * dV;) l'm,i  - iii  x  i4,j2 

-  (1/4)  4  M2J 

(24) 

=  n|  x  Hinc  on  S12 

[Ls,  ■![  -  4,  M,  ]„„  =  C 

on  S13 

(25) 

[Li  h  -  4,  M2  lean  -  C 

on  S23 

(26) 

These  are  nine  vectorial  equations  in  nine  vectorial  unknowns.  The  unknowns  are  the 
electric-  and  magnetic-surface  currents  J0j,  Jq2-  Jo?-  Ji2'  J13-  J23-  M01,  ^02-  an^  ^12-  The 
integral  equations  are  reduced  to  matrix  equations  using  the  method  of  moments  by  using 
triangle  testing  and  triangle  expansion  functions  [8]. 
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The  above  surface  integral  equations  are  applied  to  rotationally-symmetric  bodies.  The 
reduction  of  the  integral  equations  to  matrix  equations  involving  unknown  surface  currents 
follows  a  well-known  procedure  [9].  After  the  necessary  manipulations  of  the  method  of 
moments,  the  general  matrix  takes  the  form 

[  T  ]„  [I]„  =  [V]„  (27) 


where  T  is  a  square  matrix,  representing  a  combination  of  impedance  and  admittance 
submatrices  that  are  given  in  [9];  In  is  a  column  matrix  for  the  unknown  expansion  coefficients 
of  the  unknown  current  components  J  and  M;  and  Vn  is  the  excitation  column  matrix.  Once  the 
matrix  is  solved,  the  induced  currents  on  all  surface  interfaces  can  be  determined.  Scattered  far 
fields  can  be  determined  from  the  induced  currents  on  the  outer  surfaces. 

Ill  RESULTS  AND  DISCUSSION 

a.  Bistatic  RCS: 

In  this  section  the  bistatic  RCS  is  computed  numerically.  The  numerical  results  are 
verified  by  comparison  with  analytical  and  supplemental  numerical  results. 

Dielectric  sphere:  First,  to  verify  the  numerical  results  of  objects  made  of  more  than  one 
homogeneous  region,  the  example  of  a  sphere  geometry  is  considered.  The  series  solution  for 
a  dielectric  sphere  [10]  having  er=  4-j0.5,  iir=  2-j0.25,  and  ka=3  is  obtained  and  compared 
with  the  numerical  solution  of  the  same  object  when  it  is  divided  into  three  homogeneous  regions 
of  the  same  material  type.  The  results  obtained  from  both  solutions  must  be  identical,  because 
physically  both  cases  represent  the  same  object.  The  agreement  between  the  two  solutions  is 
excellent,  as  shown  in  Fig.  3.  Several  spherical  objects  have  been  considered,  such  as  the 
coated  sphere  and  the  multilayered  sphere;  however,  their  results  have  been  omitted  for  brevity. 
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00 

Fig.  3.  Bistatic  RCS  of  dielectric  sphere  divided  into  three  regions  of  the  same  materials, 

er=4-j0.5,  /xr= 2-j0.25,  and  ks=3. 


Dielectric  toroid:  Another  geometry,  of  a 
dielectric  toroid,  is  considered,  as  shown  in  Fig. 4a. 
The  numerical  solution  is  obtained,  once,  when  the 
circular  cross  section  is  divided  into  six  regions,  all 
of  which  have  the  same  type  of  material.  This 
numerical  solution  is  compared  with  the  numerical 
solution  of  the  same  object  when  it  is  treated  as  one 
region.  The  agreement  between  both  solutions  is 
excellent,  as  shown  in  Fig. 4b.  When  each  region  is 
filled  with  different  homogeneous  materials,  the 
computed  bistatic  RCS  is  shown  in  Fig.  5.  Notice 


«2 


Fig.  4a.  Toroid  cross  section  divided 
into  six  homogeneous  regions. 


that  the  scattering  level  has  increased  in  the  whole  bistatic  range. 
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Fig.  4b.  Bistatic  RCS  of  the  toroid  when  all  regions  have  the  same  materials, 
er=4-j0.5,  /zr= 2-j0.25,  ka=3,  kb=4.4. 


Fig.  5.  Bistatic  RCS  of  the  toroid  with,  erJ  =3-j3,  er2=2-j2,  er3=4-j0.5,  cr4 =6-j6, 
e  s=5-j5,  e  6=4-j4,  and  n  =2-j0.25  in  all  regions. 


Loaded  conducting  bicone:  To  examine  the  numerical  solution  performance  of  objects 
made  of  dielectric  and  conductors,  the  case  of  a  biconical  conducting  object  is  selected.  The 
numerical  solution  of  this  object  and  the  conducting  bicone  is  compared  with  the  numerical 
solution  of  a  conducting  bicone  surrounded  by  an  artificial  dielectric  material  of  free-space 
permittivity  and  permeability,  as  shown  in  Fig.  6.  In  this  example,  the  bicone  is  illuminated  by 
a  plane  wave  of  normal  incidence  (0=90°).  Therefore,  the  number  of  the  azimuthal  modes  that 
is  used  to  obtain  the  bistatic  RCS  is  eleven,  n=0,...,  ±5;  for  the  axial  incident  cases  the 
required  modes  are  +.1.  Figure  6  shows  the  excellent  agreement  between  the  two  solutions. 
Also  shown  is  the  symmetry  around  0=90°.  In  this  example,  the  H-plane  shows  a  zero-back 
and  forward-scattering,  and  the  E-plane  shows  its  maximum  values  at  these  directions.  When 
the  free-space  part  of  this  object  is  filled  with  homogeneous  materials  of  er  =  4,  the  RCS  is 
computed  as  shown  in  Fig.  7.  The  sidelobes  of  the  scattered  H-plane  patteren  disappear,  and 
the  forward  scattering  for  the  E-plane  pattern  becomes  higher  than  the  back-scattering. 


f  f 
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Fig.  7.  Bistatic  RCS  of  the  bicone  when  er=4.,  nr=l. 


b.  Monostatic  RCS: 

In  the  above  examples,  the  bistatic  RCS  were  considered.  Next  monostatic  RCS  are  also 
verified  numerically  and  experimentally. 

Half-conducting  sphere:  The  first  example  is  the  perfectly-conducting  dielectric 

hemisphere.  For  numerical  verification,  the  dielectric  is  considered  as  a  free  space.  The 
monostatic  RCS  is  compared  with  the  numerical  solution  of  a  conducting  hemisphere,  as  shown 
in  Fig.  8.  This  figure  indicates  excellent  agreement  between  both  solutions  in  the  whole  6 
range.  When  the  free-space  part  is  replaced  by  materials  of  er=2-j0.5,  and  ^r=3-j0.5,  the 
monostatic  RCS  is  computed  as  shown  in  Fig.  9.  The  effect  of  the  presence  of  dielectric  materials 
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Fig.  8.  Monostatic  RCS  of  a  conducting  hemisphere  compared  with  the  scattering  from 
a  conducting  hemisphere-air  hemisphere  object,  ka=3. 


Fig.  9.  Monostatic  RCS  of  a  conducting  hemisphere-dielectric  hemisphere  object, 

ka  =  3,  er=2-j0.5,  /xr=3-j0.5. 
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is  different  from  the  free  space.  This  difference  is  clear  in  the  backscattering  data  for  ranges 
of  0<9O,  which  is  the  dielectric  side.  Significant  reduction  of  the  backscattering  level  is 
observed.  When  er=l,  the  effect  of  the  flat  surface  and  the  sharp  edges  obviously  contributed 
to  the  high  backscattering  levels. 

Round-tip  cone:  In  this  section,  new  numerical  results  are  presented  and  verified  with 
measurements,  which  have  been  collected  at  the  RCS  Range  of  Group  95  at  Lincoln  Laboratory. 
The  first  geometry  that  is  considered  is  the  partially  coated,  perfectly  conducting  round-tip  cone, 
shown  in  Fig.  10.  The  first  case  that  has  been  considered  is  the  one  that  corresponds  to  a 
homogeneous  coating;  in  other  words,  the  materials  in  both  coating  regions  are  the  same.  The 
monostatic  RCS  are  shown  in  Fig.  1 1  for  both  66  polarization  (left)  and  <£<£  polarization  (right), 
respectively.  When  the  coating  on  the  second  region  is  removed,  the  second  region  is  then 
equivalent  to  free-space  permittivity.  The  monostatic  RCS  is  computed  and  compared  with  the 
measurements  as  shown  in  Fig.  12  for  66  polarization  and  <f><f>  polarization,  respectively.  Figure 
13  show  the  monostatic  RCS  of  both  polarizations  when  second  region  is  filled  with  materials 
of  er=2.60.  In  the  above  three  cases,  excellent  agreement  is  obtained  between  the  measured 
data  and  the  computed  data. 

T 
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f 

b 

^ _  L  _ 

All  dimensions  in  Xq  (free  space  wavelength) 

L  =  1.978.  b  =  0.2328,  a  =  0.0424,  t  =  0.0847  c  =  0.86 
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Fig.  10.  Geometry  of  a  partially  coated,  perfectly  conducting  round-tip  cone. 
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Fig.  11.  Computed  and  measured  monostatic  RCS  of  easel  in  Fig.  10. 


00  00 

Fig.  12.  Computed  and  measured  monostatic  RCS  of  case2  in  Fig.  10. 


a  (dBsm) 


Fig.  13.  Computed  and  measured  monostatic  RCS  of  case3  in  Fig.  10. 


Fig.  14.  Geometry  of  a  perfectly-conducting  cylinder  partially  coated  with  two 

dielectric  layers 


Finite  cylinders:  Another  geometry,  double  layered  coating  of  a  finite  conducting 
cylinder,  is  considered.  Figure  14  shows  the  geometry  of  the  cylinder.  The  values  of  the 
parameters  of  Fig.  14  are  given  in  Table  I  for  three  cases.  All  the  measurements  in  this  section 
have  been  performed  at  4GHz.  Figure  15  (left)  shows  the  comparison  between  the  computed 
and  measured  data  of  the  monostatic  RCS  for  66  polarization  of  easel  in  Table  I.  In  this  case 
only  one  layer  is  considered. 


Table  I.  Parameters  of  Figure  14 


case 

a(cm) 

L(cm) 

tj(cm) 

frl 

t2(cm) 

fr2 

D 

3.0 

27.94 

0.1528 

35.2- J28.68 

- 

- 

2 

3.0 

27.94 

0.1528 

35.2-j28.68 

0.68 

1.757-j  1.569 

3 

2.814 

21.59 

0.988 

2.05 

0.1528 

35.2-j28.68 

The  measured  data  of  </><£  polarization  are  not  available,  however,  the  computed  data  are  shown 
in  Fig.  15  (right).  The  monostatic  RCS  of  the  second  case  in  Table  I  of  the  two  layered  coating 
is  shown  in  Fig.  16.  Again,  only  the  measurements  of  the  60  polarization  are  available.  In  the 
last  case,  the  first  layer  is  a  high  loss  material  of  large  permittivity  and  the  second  one  is  high 
loss  of  small  permittivity.  In  the  third  example  of  Table  I,  the  first  layer  is  made  of  a  lossless 
material  of  small  permittivity;  the  second  layer  is  made  of  high-loss  high-permittivity  materials. 
The  monostatic  RCS  is  shown  in  Fig.  17  for  the  66  and  <f>4>  polarization,  respectively.  The  last 
three  cases  represent  large  objects.  The  agreement  between  the  measured  and  computed  data 
is  satisfactory.  In  general,  the  above  results  show  accuracy  of  the  measurements  within  a  wide 
dynamic  range  of  sensitivity. 


20 


a  (dBsm)  a  (dBsm) 


00 

Fig.  16.  Computed  and  measured 
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Fig.  17.  Computed  and  measured  monostatic  RCS  of  case3  in  Table  I. 


IV  CONCLUSION 

In  this  paper,  a  computer  program  has  been  developed  to  compute  the  electromagnetic 
scattering  from  axisymmetric  objects.  The  method  of  moments  is  used  to  solve  the  surface- 
integral  equations  formulation  (E-PMCHW).  A  number  of  objects  has  been  analyzed  by  this 
program.  Each  object  consisted  of  arbitrarily  arranged  homogeneous  regions.  Both  bistatic  and 
monostatic  RCS  data  were  presented.  The  computed  data  were  verified  either  numerically  or 
experimentally  and  excellent  agreement  was  demonstrated. 
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Abstract 

In  the  microwave  case  the  physical  optics  (PO)  method  is  frequently  used  for 
the  analysis  of  complex  structures  which  are  modeled  by  flat  plates  of  trian¬ 
gular  or  quadrangular  shape.  The  study  of  the  radar  cross  section  (RCS)  of  an 
isolated  panel,  however,  reveals  deviations  from  experimental  results  which 
are  due  to  edge  diffraction  effects  not  considered  by  PO.  In  order  to  correct 
the  PO-field  by  an  additive  field  term,  the  equivalent  fringe  currents  (EC) 
of  Michaeli  have  been  used  to  derive  the  backscattering  matrix  of  an  isolated 
edge.  By  adding  the  matrices  of  the  individual  edges  to  the  PO-matrix  the  RCS 
of  a  square  flat  plate  with  zero  and  finite  thickness  is  analysed  and  the 
result  is  compared  with  measurements.  The  efficiency  of  the  method  is  demon¬ 
strated  for  objects  modeled  by  a  higher  number  of  panels  and  edges,  namely  a 
cylinder  and  a  double  dihedral.  All  computations  were  performed  with  the  com¬ 
puter  code  SIG5  of  the  Institute. 


1 .  Introduction 

Since  several  years,  the  computer  program  SIG5  is  applied  in  the  Institute 
for  Radio  Frequency  Technology  for  the  prediction  of  the  RCS  of  structures 
which  are  complicated  in  shape  and  large  compared  to  the  wavelength.  SIG5, 
based  on  PO,  is  capable  of  analysing  perfectly  and  imperfectly  conducting 
structures,  including  double  reflections.  The  targets  are  modeled  by  panels 
of  triangular  and  quadrangular  shape,  see  Fig.  1.1.  The  hidden  surface  pro¬ 
blem  inherent  with  PO  is  solved  by  an  exact  construction  of  the  shadow  boun¬ 
dary  for  each  panel,  whose  size  is  only  limited  by  the  admissible  deviation 
between  the  true  surface  and  the  model  surface.  SIG5  is  organized  in  a  very 
similar  way  to  the  computer  code  RECOTA,  developed  by  the  Boeing  Aerospace 
Company,  Seattle  [1].  SIG5  has  been  successfully  tested  for  a  series  of  per¬ 
fectly  conducting  basic  structures  such  as  a  sphere,  cylinder,  cube,  circular 
disk  and  a  double  dihedral  [2-4].  Also,  more  complex  bodies  like  a  periscope 
structure  have  been  analysed  with  promising  results  [5] . 


Fig.  1.1 

Panel  model  of  an  air¬ 
plane  for  the  applica¬ 
tion  of  PO. 
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Despite  the  good  results  which  are  generally  available  with  PO,  there  are 
special  cases  (certain  structures,  specific  pattern  cuts,  selected  polari¬ 
zations)  where  deviations  of  practical  significance  between  computational  and 
experimental  results  are  observed.  They  frequently  can  be  explained  by  the 
feature  of  PO  to  treat  the  influence  of  an  edge  only  as  the  geometrical  boun¬ 
dary  of  a  panel  thus  neglecting  physical  edge  diffraction  effects. 

The  choice  of  a  theory  which  takes  into  account  edge  diffraction  effects  is 
influenced  by  the  following  viewpoints: 

a)  Since  SIG5  is  a  very  comprehensive  computer  code  each  extension  should 
cause  a  minimum  of  changes.  Therefore,  theories  are  preferred  which  are 
able  to  correct  the  PO-solution  by  an  additive  term. 

b)  In  computing  the  RCS  of  large  and  complex  objects  modeled  by  numerous  pa¬ 
nels  the  computer  effort  increases  considerably.  Therefore,  theories  which 
need  a  high  computer  effort  are  not  favoured. 

c)  Since  edges  of  arbitrary  length,  wedge  angle  and  orientation  in  space  oc¬ 
cur  in  the  target  model,  the  theory  should  not  present  a  solution  for  a 
specific  panel,  rather,  the  diffracted  field  of  an  isolated  edge  must  be 
described  by  the  specific  edge  parameters  alone. 

Bearing  these  points  in  mind,  only  asymptotic  theories  come  into  question, 
which  either  describe  the  difference  between  the  total  wedge  diffracted  field 
and  the  PO-field  directly  [6,7]  or  which  evaluate  fringe  currents  flowing 
along  the  edge  and  generating  the  same  difference  rield  by  evaluating  the  ra¬ 
diation  integral  over  the  length  of  the  edge  [6,  9] . 

In  this  paper,  the  second  theory  is  followed.  Here  the  fringe  currents  given 
in  [9]  are  preferred,  since  they  are  valid  for  arbitrary  aspects  of  observa¬ 
tion.  On  the  basis  of  these  currents,  the  backscattering  matrix  for  an  isola¬ 
ted  edge  is  derived. 

In  the  following  section  the  theoretical  background  is  discussed.  In  section 
3,  the  theory  is  applied  for  the  RCS-analysis  of  a  panel.  In  section  4,  the 
results  for  a  cylinder  and  a  double  dihedral  are  presented.  Section  5,  final¬ 
ly,  summarizes  some  conclusions  on  the  basis  of  the  preceeding  analysis. 


2.  Theoretical  Background 

The  results  of  the  theories  used  in  the  computer  code  SIG5  are  expressed  by 
the  backscattering  matrix 


(2.1)  [T] 


11 

1 1 2 

21 

t-22- 

which  relates  the  cartesian  components  of  the  scattered  field  Es  to  those  of 
the  incident  field  Ee: 


^sx 

^ex 

-  [T] 

.^sy. 

-E'ey. 
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The  propagation  direction  of  the  scattered  field  is  given  by  the  z-axis 
(observer  fixed  coordinate  system) .  From  the  complex  elements  t± j  of  the 
scattering  matrix  the  polarization  dependent  RCS  is  computed: 


(2.3)  o ij  =  lim  (4nr2  tij  tij)  , 

where  r  is  the  distance  between  the  radar  observer  and  the  test  object.  The 
RCS  referred  to  1  square  meter  and  expressed  in  decibels  yields  the  quantity 
dBsm,  which  is  used  in  this  paper  to  compare  theoretical  and  experimental  re¬ 
sults  for  the  selected  transmitting/receiving  polarizations. 

The  total  scattering  matrix  [T]  of  a  complex  body  can  be  thought  to  be  compo¬ 
sed  of  a  matrix  [Tpo]  which  is  the  sum  of  the  scattering  matrices  based  on  PO 
of  the  individual  N  panels  and  the  scattering  matrix  [Tf]  which  sums  up  the 
scattering  matrices  of  the  M  edges  of  all  panels: 


(2.4)  [T]  =  [Tpo]  +  [Tf]  , 


N 

[Tp°]  =  J  [Tgo] 
n 


M 


[Tf ]  =  } 
m 


[t£] 


The  PO-scattering  matrix  of  an  individual  panel  of  zero  thickness  and  perfect 
conductivity  is  readily  evaluated  using  the  radiation  integral 


(2.5) 


23  (r) 


=  jk  e~^r 
4jc  r 


|  (sx  (sxZ?F  (r' ) )  )e--^s  "r,df '  , 
fp 


and  introducing  the  surface  current 


(2.6)  3F(r') 


2nx (expe) Ee/Z  e  r  \  .,  /illuminated  \  parts  of 

.0  )  on  the  (shadowed  )  the  panel. 


The  geometrical  parameters  r,  r' ,  e,  s,  Fp  and  n  are  explained  in  Fig.  2.1, 
while  the  electric  parameters  are  given  by  \  =  wavelength,  k  =  2 n/X  =  wave 
number,  Z  =  wave  impedance  of  the  propagation  medium,  pe  =  px  ex  +  py  ey  = 
unit  polarization  vector  and  Ee  =  magnitude  of  the  incident  electric  field. 


Fig.  2.1 

Geometrical  scheme  for  the 
interpretation  of  the  ra¬ 
diation  integral. 


26 


The  evaluation  of  the  radiation  integral  for  the  backscattered  (e  =  -s)  field 
in  the  observer  fixed  cartesian  coordinate  system  and  the  comparison  with 
(2.2)  results  in  the  following  backscattering  matrix: 


(2.7)  [Tpo] 


jk  e~^kr  j*  j2kz' 

2n  r  J  e 

fp 


dx' dy' 
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For  a  panel  of  polygonal  shape  the  phase  integral  can  be  solved  analytically. 
Details  of  the  integration  are  given  in  [4]  for  panels  of  triangular  and 
quadrangular  shape  which  are  used  in  the  computer  program  SIG5  to  model  the 
target . 


Normalizing  the  scattering  matrix  to  e~i*r/r,  the  matrix  appears  purely  ima¬ 
ginary.  Independent  of  the  orientation  of  the  panel,  no  differences  between 
xx-polarization  and  yy-polarization  nor  any  cross-polarization  are  predicted 
by  PO.  This,  however,  is  only  true  if  the  panel  is  perfectly  conducting  and 
no  double  or  multiple  interactions  can  occur  between  pairs  or  a  higher  number 
of  panels. 


The  electric  and  magnetic  fringe  currents  of  Michaeli  [9]  can  be  written  in  a 
very  compact  way  using  the  coefficients  D|,  d£,  D|m  (these  symbols  are  propo¬ 
sed  by  Knott  [10] ) : 

j2Eet  j2Het 

(2.8)  If  (Ve/Vs^'Pef  Ps)  - Di(\|fe,Vs;Pe/Ps) - Dim  < Ve /  Vs Pef  P s )  , 

kZsin2pe  ksin2pe 


(2.9)  Mf  (Ve,Vs»'t  J,PS) 


]2ZHet 

ksinpesinps 


Dm ( Vef Vs ' Pe' Ps ) 


Eet  =  t-£e  or  Het  =  t*Ue  is  the  component  of  the  incident  electric  or 
magnetic  field  parallel  to  the  edge  with  unit  tangent  vector  t.  The  geo¬ 
metrical  parameters  ye,  \ys,  pe,  ps  and  n  are  explained  in  Fig.  2.2.  The 
formulas  for  D|,  Dl  and  D|m  are  given  in  the  appendix  for  the  backscatte¬ 
ring  Case  ( Ve  =  Vs'  Ps  =  Ji  -  Pe)  • 


plane  of 
Incidence 


Fig.  2.2 
Wedge  geometry. 
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Before  choosing  the  fringe  currents  of  Michaeli,  it  was  necessary  to  relate 
this  theory  to  other  theories,  which,  in  principle,  could  be  used  to  solve 
edge  diffraction  problems  observing  the  viewpoints  given  in  the  introduction. 

Though  there  are  several  papers  [10  -  15]  dealing  with  the  relationships  be¬ 
tween  asymptotic  diffraction  theories,  it  is  useful  to  summarize  the  results 
with  the  aid  of  a  few  statements.  Thereby  the  output  of  the  different  theo¬ 
ries  are  given  in  the  form  of  currents.  For  mathematical  details  see  [15]. 

The  fringe  currents  of  Michaeli  can  be  written  in  the  following  way: 

(2.8a)  If  =  I11  -  Ipo  , 

(2.9a).  Mf  =  Mfc  -  Mpo  , 

where  Ifc,  M*1  are  the  equivalent  electric  and  magnetic  total  currents,  respon¬ 
sible  for  the  total  wedge  diffracted  field  while  Ipo  and  Mpo  are  the  equiva¬ 
lent  PO-components  responsible  for  the  PO-field  of  the  wedge.  The  expressions 
derived  by  Michaeli  are  obtained  by  asymptotic  end-point  evaluation  of  the 
fringe  current  radiation  integral  over  the  ray  coordinate  measured  along  the 
diffracted  ray  grazing  the  surface  of  the  local  wedge.  The  resulting  expres¬ 
sions  are  finite  for  all  aspects  of  illumination  and  observation,  except  for 
the  special  case  where  the  direction  of  observation  is  the  continuation  of  a 
glancing  incident  ray  coming  from  outside  the  wedge.  This  situation  occurs 
only  in  forward  scattering. 

Choosing  the  coordinate  for  the  radiation  integral  in  a  traditional  way  nor¬ 
mal  to  the  edge,  expressions  for  the  fringe  currents  are  obtained  [16]  which 
display  infinities  for  certain  combinations  of  observation  and  incidence  di¬ 
rections.  These  fringe  currents  are  identical  to  those  which  can  be  evaluated 
[15]  from  the  fringe  diffracted  fields  of  Mitzner  [7].  Now  adapting  the  frin¬ 
ge  currents  to  the  cone  of  diffracted  rays  one  would  expect  that  these  would 
be  identical  to  those  which  can  be  derived  from  the  fringe  field  expressions 
of  Ufimtsev  [6]  and  which  are  used  in  [1],  This  is  the  case  for  the  electric 
total  current,  for  the  magnetic  total  and  PO-current  but  in  general  not  for 
the  electric  PO-current. 

The  total  currents  are  identical  to  the  filamentary  currents  of  Keller's 
GTD-field  [17],  represented  by  Knott  and  Senior  in  a  compact  manner  [18]  and 
denoted  as  equivalent  currents  by  Ryan  and  Peters  [19]  who  applied  the  con¬ 
cept  to  compute  the  edge  diffracted  field  in  caustic  regions.  Evaluating  the 
equivalent  currents  of  the  GTD  for  vertical  incidence  (j)e  =  90°)  on  a  half¬ 
plane  (n  =  2)  one  receives  the  filamentary  currents  of  the  diffracted  field 
derived  by  Sommerfeld  [20] . 

The  difference  in  the  electric  PO-currents  is  given  by  a  coupling  term  be¬ 
tween  the  incident  electric  field,  being  perpendicular  to  the  plane  of  inci¬ 
dence,  and  the  PO-field  having  a  component  parallel  to  the  plane  of  diffrac¬ 
tion.  This  coupling  term  is  not  taken  into  account  by  the  theory  of  Ufimtsev 
as  was  pointed  out  in  [14].  Only  in  the  case  that  both  faces  of  the  wedge  are 
iLluminated  by  the  incident  wave  does  the  coupling  term  become  identical  to 
zero  [15]  and  the  fringe  currents  extracted  from  the  theory  of  Ufimtsev 
become  identical  to  the  fringe  currents  of  Michaeli. 
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Introducing  now  the  fringe  currents  (2.8)  and  (2.9)  in  the  radiation  integral 
over  a  filamentary  current  with  length  L 


(2.10)  £i(r)  =  jji  e  ]-r-  f  (Z  lf (?')  (sx(sxt))  +  Mf  (r' )  (sxt) )  e^3'1’  dl'  , 

2  it  r  J 

L 

one  arrives  at  the  backscattering  matrix 


(2.11)  [Tf ] 


1_ 

2k 


e 


j2ks • r' 


dl' 


X 


D|  t|  +  D£  t*  -  D|m  tx  ty 

(De  “  Dm)  tX  ty  “  D|m  ty 


(Dl  -  Dm)  tx  ty  +  Dim  ti 

Dm  tX  +  Di  ty  +  Dim  tX  ty 


The  geometrical  parameters  are  explained  in 


Fig.  2.3. 


Fig.  2.3 

Geometrical  sketch  to  evaluate 
the  radiation  integral. 


As  in  the  case  of  the  PO-scattering  matrix  the  phase  integral  can  be  solved 
analytically.  Introducing  the  mid-point  vector  rM  of  the  edge  one  obtains 


(2.12)  j*  e j2ks • r'  dl, 
L 

with  tz  =  -cospe. 


L 


sin  (kLt-J 
kLtz 


e j2krMz 
- 2 — 

l-tz 


t 


Normalizing  the  backscattering  matrix  [Tf]  by  e_jkr/r  it  appears  purely 
real.  In  general  differences  between  HH-  and  VV-polarization  are  predicted. 
Also,  since  [Tf]  is  asymmetric,  differences  between  HV-  and  VH-polariza- 
tion  are  predicted  which  is  not  correct  for  monostatic  scattering  processes. 
This  is  due  to  corner  diffraction  effects  which  arise  with  an  edge  of  finite 
length  and  which  are  not  considered  in  the  theory.  Symmetry  is  observed  for 
the  special  case  of  normal  wave  incidence  (Pe  =  90°).  Further,  if  the  contour 
of  a  panel  is  a  smooth  curve,  the  matrix  [Tf]  would  be  symmetric. 


For  the  discussions  in  the  next  section,  some  properties  of  the  coefficients 
D|,  Dm,  D|m  are  needed,  which  are  summarized  in  the  following.  Fig.  2.4  shows 
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the  coefficients  for  normal  incidence  ((Je  =  90°)  and  a  half-plane  (n  =  2)  de¬ 
pending  on  the  angle  ye.  In  Fig.  2.5  the  coefficients  for  a  step  (n  =  1.5) 
are  represented  where  Djm  is  identical  to  zero.  In  both  cases,  mirror  sym¬ 
metries  for  each  coefficient  become  relevant  about  ye  =  180°  for  the  half¬ 
plane  and  ve  =  135°  for  the  step.  Further  symmetries,  now  radial  symmetries, 
occur  for  the  half-plane  between  coefficient  Dj  and  d£  about  ye  =  90°  and 
Ve  =  270°. 


Fig.  2.6  shows  the  coefficients  D|,  d£  and  D|m  for  ye  =  270°  and  a  half-plane 
now  dependent  of  the  angle  pe.  One  also  finds  symmetries,  namely  mirror- 
symmetry  for  D|  and  d£  and  radial-symmetry  for  D|m  about  pe  =  90°.  These  sym¬ 
metries  occur  for  all  ye  =  const. 


Fig.  2.4  Fig.  2.5 

Coefficients  d£  ( - )  and  d£  ( - )  for  Coefficients  Dl  ( - )  and  d£ 


a  half-plane  (n=2)  depending  on  fe;  0, 

i.o - — — — 


-0.8  ■  ■  • 

-1.0  -  -  *  *'■  . 

o  30  60  90  120  150  180 

ffe  {deg  )  — — 1 ■ 


=90°.  ( - )  for  a  step  (n=1.5)  de¬ 

pending  on  *e;  pe=90°. 


Fig.  2.6 

Coefficients  Df  ( - ) ,  d£ 

( - )  and  D|m  (-.-)  for  a 

half -plane  depending  on  pe; 
fe  =  270°. 


3 .  Analysis  of  the  RCS  of  a  Square  Test  Panel  by  the  PO-  and  EC-Method 

Studies  similar  to  the  ones  reported  here  were  performed  by  Ross  [21],  Sikta, 
Burnside,  Chu  and  Peters  [22],  Balanis,  Griesser  and  Marsland  [23],  Volakis 
and  Ricoy  [24],  Pelosi,  Tiberio,  Puccini  and  Maci  [25],  Ivrissimtzis  and  Mar- 
hefka  [26].  Ross  applied  the  GTD  up  to  triple  diffraction  to  compute  the  RCS 
of  rectangular  flat  plates.  Sikta  et  al.  used  a  modified  equivalent  current 
concept  based  on  GTD  and  a  corner  diffraction  analysis  to  study  the  RCS  of 
flat  plate  structures  such  as  a  square  flat  plate,  a  finlike  plate  and  a 
disc.  Balanis  et  al.  used  GTD  up  to  third  order  diffraction  for  principal 
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plane  and  EC-currents  of  Michaeli  for  off-principal  plane  backscattering  of 
plates.  The  angular  spectrum  method  along  with  the  generalized  matrix  formu¬ 
lation  were  employed  for  the  diffraction  analyses  of  a  thick  perfectly  con¬ 
ducting  half-plane  by  Volakis  and  Ricoy.  Pelosi  et  al.  applied  GTD  to  calcu¬ 
late  the  RCS  of  a  square  plate.  Ivrissimtzis  and  Marhefka  used  a  uniform 
ray  approximation  including  higher  order  terms  for  the  analysis  of  polyhedral 
structures . 

The  procedure  presented  in  the  previous  section  is  not  limited  to  the  simple 
structure  of  a  flat  plate  as  will  be  demonstrated  in  the  next  section.  How¬ 
ever,  in  this  section  the  scattering  from  a  square  test  plate  is  investigated 
in  great  detail  to  estimate  the  efficiency  and  the  limits  of  application  of 
the  chosen  method. 

The  square  test  panel  is  analysed  at  a  frequency  of  16.66  GHz  which  results 
in  a  wavelength  of  17.995  mm.  The  edge  length  is  91.4  mm  (5.08  A.)  and  the 
thickness  is  0.8  mm  (0.044  A).  In  order  to  have  an  independent  external  accu¬ 
racy  check  for  the  following  computations,  the  test  panel  was  chosen  to  be 
equal  to  one  of  the  panels  investigated  by  Ross  [21]  .  The  panel  is  defined  in 
the  observer-fixed  coordinate  system,  see  Fig.  3.1.  The  rotations  around  the 
x,  y  and  z-axis  are  defined  by  the  angles  ipx,  cpy,  <p2.  An  incident  wave  with 
electric  field  vector  parallel  to  the  x-axis  is  designated  as  horizontally 
polarized  while  a  field  vector  parallel  to  the  y-axis  describes  a  vertically 
polarized  field,  if  the  object  is  rotated  around  the  y-axis  for  instance. 


Fig.  3.1  Orientation  of  the  panel  in  the  observer-fixed  coordinate  system 
and  numbering  of  the  edges  for  a  panel  with  zero  thickness. 


Fig.  3.2  resp.  Fig.  3.3  shows  the  experimental  RCS-result  for  horizontal 
resp.  vertical  polarization  and  for  the  principal  plane  (<px=0,  (py=a,  q>z=0)  . 
For  a  =  0°  one  has  normal  incidence,  for  a  =  90°  the  panel  is  seen  under  gra¬ 
zing  incidence.  The  PO-result  for  a  panel  with  zero  thickness,  which  is 
usually  compared  with  experimental  results,  is  shown  in  Fig.  3.4.  The  maximum 
RCS-value  amounts  to  4.33  dBsm. 
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Fig.  3.2 

Experimental  result  for  horizontal 
polarization. 
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Fig.  3.4 

PO-result  for  the  test  panel 
with  zero  thickness. 


Fig.  3.3 

Experimental  result  for  vertical 
polarization. 
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Fig.  3.5 

PO-result  for  the  test  panel 
with  thickness  0.044  X. 


Besides  the  smoothing  of  the  deep  nulls  of  the  PO-result,  the  sidelobe  peaks 
far  from  the  mainlobe  are  significantly  higher  in  the  experiment  than  in  the 
theory.  At  grazing  incidence  a  deep  null  of  -<*>  dB  is  predicted  from  theory. 

In  the  experiment  this  null  occurs  only  for  horizontal  polarization,  while 
for  vertical  polarization  a  level  of  about  -27  dB  under  the  main  lobe  peak 
can  be  observed.  In  view  of  the  experimental  results  the  theoretical  results 
need  improvement. 

However,  it  must  be  emphasized,  that  for  the  experiment  a  panel  with  a  thic¬ 
kness  of  0.044  X  was  used  while  for  the  theory  the  panel  was  assumed  with 
zero  thickness.  Thus  far  the  comparison  is  a  little  unfair.  If  one  models  the 
flat  plate  with  the  same  thickness  as  was  used  for  the  experiment,  then  it 
consists  of  six  faces.  Each  of  them  can  contribute  to  the  total  scattered 
field  if  it  is  illuminated  by  the  incident  field.  The  result  is  presented  in 
Fig.  3.5.  One  can  see  that  the  deep  nulls  are  filled  up,  that  the  peaks  of 
the  distant  sidelobes  are  higher  than  in  the  case  of  zero  thickness,  and 
that,  at  grazing  incidence,  an  RCS  on  the  order  of  -41  dB  under  the  peak  is 
predicted  compared  to  -<*>  dB  for  the  zero  thickness  panel  and  the  -27  dB  of 
the  experiment. 
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In  the  following  the  influence  of  the  edges  is  considered.  First  the  panel 
with  zero  thickness  is  analysed. 

Edges  2  and  4  make  an  angle  pe  =  90°  with  the  incident  ray  independent  from 
(py/  see  Fig.  3.1,  while  (3e  of  edges  1  and  3  varies  with  |5e  =  90°  +  <py.  From 
Eq.  (2.11)  and  Eq.  (2.12),  one  can  expect  an  oscillating  behaviour  of  the  RCS 
of  edges  1  and  3  and  a  monotonic  function  for  edges  2  and  4  with  respect  to 
the  rotation  angle  (py.  For  rotation  angles  0°  <  <py  <  180°,  edge  2  is  nearer 
to  the  radar  observer  than  edge  4,  therefore,  it  is  sometimes  called  the  lea¬ 
ding  edge  while  edge  4  is  the  trailing  edge. 


First  the  influence  of  edges  1  and  3  on  the  RCS  computed  by  PO  is  considered. 
The  scattering  matrix  of  edge  1  is  given  by 

1  jkr  s in  ( kLsintpy )  De  Dem 

(3.1)  [Tf ]  =  -  T—  2 -  L  — rq — : -  , 

1  2n  r  kLsin<py  Q  Df 


with  arguments  ve  =  90°,  (Je  =  90°  +  (py  for  Df,  Df,  and  Dfm.  For  edge  3  the 
following  formula  is  obtained: 


,  1  e_jkl 

(3.2)  [TSl—^h- 


sin  (kLsin<py)  Df  Dem, 


kLsin<py 


0  Df 


with  Ve  =  90°,  Pe  =  90°  -  (Py. 


We  can  make  the  following  observations: 

1.  Both  scattering  matrices  are  purely  real,  after  normalizing  by  the  fac¬ 
tor  e"ikr/r. 


2.  The  maximum  value  is  reached  for  <py  =  0°  (the  incident  ray  hits  the 

plate  normally) .  The  maximum  value  amounts  to  about  -32  dBsm  which  is  ab¬ 
out  -36  dB  under  the  PO-value.  This  means  that  the  peak  value  computed  by 
the  PO-method  is  negligibly  influenced  by  the  diffraction  effects  of  ed¬ 
ges  1  and  3.  Since,  further,  the  RCS  decreases  with  increasing  angle  cpy 
in  an  oscillating  manner,  both  edges  have  practically  no  influence  on  the 
PO-result . 


3.  Since  mirror  symmetries  for  De  and  Dm  hold,  see  Fig.  2.6,  the  copolar 
scattered  fields  of  each  edge  are  equal  in  amplitude  and  phase. 

4.  Both  matrices  indicate  a  cross-polarization  term  and,  therefore,  are 
asymmetric.  Since,  however,  for  the  edge  coefficient  Dfm  a  radial  symme¬ 
try,  see  Fig.  2.6,  holds,  the  sum  matrix  [Tf,3]  =  [Tf]  +  [Tf]  is  sym¬ 
metric  and  is  given  by 

0 

Df 

with  arguments  ve  =  270°,  (5e  =  90°  -  <py.  The  sum  matrix  predicts  diffe¬ 
rences  between  horizontal  and  vertical  polarization.  However,  this  effect 
has  practically  no  influence  on  the  final  result. 
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Fig.  3.6  shows  the  RCS  of  edge  1  and  3  for  horizontal  polarization  and  Fig. 
3.7  for  vertical  polarization.  Fig.  3.8  shows  the  sum  RCS  of  edges  1  and  3 
for  horizontal  polarization.  For  vertical  polarization  only  negligible  diffe¬ 
rences  exist. 


Fig.  3.6 

RCS  of  edge  1  resp.  edge  3  for 
horizontal  polarization. 
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Fig.  3.7 

RCS  of  edge  1  resp.  edge  3  for 
vertical  polarization. 


Fig.  3.8 

RCS  of  edges  1  and  3, 
horizontal  polarization. 


Contrary  to  edges  1  and  3,  edges  2  and  4  generate  contributions  of  signifi¬ 
cant  practical  interest.  Bearing  in  mind  that  D|m(Pe  =  90°)  =  0,  one  obtains 
for  edge  2 


(3.4)  [T|] 


1  e  jkr  jkasinipy 

/*v  ~  L  0 

2  n  r 


0 


0 


The  arguments  of  the  edge  coefficients  are  ye  =  270°  -  <py,  pe  =  90°.  For 
edge  4,  we  get  for  the  parameters  ve  =  270°  +  ipy,  pe  =  90° 


(3.5)  [T$] 


1  e  ^kr  .  _ -ikasin<p„ 

o  L  0  * 

2  7t  r 


0 


0 


The  following  observations  can  be  made: 
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1.  After  normalization  both  scattering  matrices  appear  to  be  complex. 

2.  The  RCS  contributions  of  both  matrices  vary  monotonically  with  respect  to 
the  rotation  angle  and  are  significant  compared  to  the  PO-solution. 

3.  The  individual  RCS  contributions  are  different  from  each  other  and  show 
differences  for  horizontal  and  vertical  polarization. 

4.  For  horizontal  polarization,  the  contribution  of  edge  4  dominates,  while, 
for  vertical  polarization,  the  contribution  of  edge  2  dominates. 

5.  Since  symmetry,  see  Fig.  2.4,  is  given,  the  RCS  of  edge  2  for  horizon¬ 
tal  (vertical)  polarization  is  identical  to  the  RCS  of  edge  4  for  ver¬ 
tical  (horizontal)  polarization.  For  the  sum  matrix  one  obtains 


(3.6) 


[T| , 4 ]  =  - 


1_  e~ jkr 
2k  r 


L 


cos  (kasin<py)  (D£-D|) 


1 

0 


0 

-1 


]  2i 


jkr 


L  sin (kasimpy) (Dm+D|) 


where  the  edge  coefficients  have  the  arguments  ve  =  270°  -  q>y,  pe  =  90°. 

The  individual  RCS  contr i oution  of  edges  2  and  4  is  represented  in  Fig.  3.9 
and  Fig.  3.10.  The  r ur  RCS  of  edges  2  and  4  is  given  in  Fig.  3.11. 


Fig.  3.9 

RCS  of  edge  2  (leading  edge)  for  hori¬ 
zontal  polarization  and  of  edge  4  (trai¬ 
ling  edge)  for  vertical  polarization. 


Fig.  3.10 

RCS  of  edge  2  for  vertical  polari¬ 
zation  and  of  edge  4  for  horizon¬ 
tal  polarization. 
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Fig.  3.11 

RCS  of  edges  2  and  4  for  both 
polarizations . 
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Fig.  3.12  finally  shows  the  PO-result  plus  the  contribution  of  all  four  edges 
which  is  identical  for  horizontal  and  vertical  polarization.  One  can  confirm 
excellent  agreement  with  the  experimental  results  for  vertical  polariza¬ 
tion.  It  is,  however,  unsatisfactory  that  the  theoretical  solution  doesn't 
indicate  any  polarization  dependence. 
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Fig.  3.12 

RCS  of  the  test  panel  with 
zero  thickness  for  horizon¬ 
tal  and  vertical  polariza¬ 
tion  computed  by  PO-  and  EC 
method. 


Mathematically  this  can  be  explained  by  discussing  the  scattering  matrix 
[T2,4]  .  The  imaginary  part  has  identical  elements  for  horizontal  and  ver¬ 
tical  polarization.  The  real  part  has  elements  of  equal  magnitude  but  opposi¬ 
te  sign,  so  that,  with  the  additive  correction  of  the  purely  imaginary  PO-so- 
lution,  the  matrix  behaves  like  -a-jb  for  horizontal  and  +a-jb  for  vertical 
polarization.  This  means  that  the  final  RCS  result  (PO-solution  +  EC-solu- 
tion)  presents  identical  values  for  horizontal  and  vertical  polarization. 

This  is  not  in  agreement  with  measurements.  Physically  the  effect  is  ex¬ 
plained  by  the  fact  that  the  theory  used  does  not  take  into  account  second 
and  higher  order  diffraction  effects.  For  horizontal  polarization,  the 
fields  associated  with  double  diffraction  e.g.  are  4  kL  greater  than  the  cor¬ 
responding  vertical  polarization  contribution  [21]  .  Polarization  dependent 
effects,  however,  are  predicted  by  the  theory  for  other  diagram  cuts,  other 
panel  shapes  (e.g.  triangle)  or  other  wedge  angles. 


The  latter  is  demonstrated  for  a  flat  plate  with  dimensions  of  the  square  pa¬ 
nel,  however,  with  the  same  thickness,  namely  0.044  X,  as  in  the  experiment. 
This  plate  is  modeled  now  by  6  panels  and  12  edges  (n  =  1.5).  Without  discus¬ 
sion  if  it  is  allowed  to  use  the  theory  for  close  adjacent  edges  the  result 
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Fig.  3.13 

RCS  of  the  test  panel  with  0.044  X 
thickness  for  horizontal  polarization. 
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Fig.  3.14 

RCS  of  the  test  panel  with  0.044  X 
thickness  for  vertical  polarization. 
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of  the  computations  is  presented  in  Fig.  3.13  and  Fig.  3.14.  For  horizontal 
polarization,  the  discrepancies  between  theory  and  experiment  have  become  mi¬ 
nor,  but  the  deep  null  for  grazing  incidence  again  is  not  predicted  by  the 
theory . 


If  the  plate  is  rotated  around  the  z-axis  by  an  angle  of  <pz  =  45°  a  diagonal 
cut  can  be  achieved  by  a  following  rotation  with  <py.  The  theoretical  results 
are  presented  in  Figs.  3.15  and  3.16.  Again  no  cross-polarization  occurs 
which  is  in  agreement  with  symmetrical  properties.  The  experimental  results 
are  presented  in  Figs.  3.17  and  3.18.  Since  the  RCS-values  drop  very  quickly 
down  to  levels  of  about  -40  dB  under  the  mainlobe  it  is  not  very  meaningful 
to  discuss  minor  deviations  between  experimental  and  theoretical  results. 
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Fig.  3.15 

Theoretical  RCS  of  the  test  panel 
for  a  diagonal  cut,  horizontal  pola¬ 
rization. 
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Fig.  3.16 

Theoretical  RCS  of  the  test  panel 
for  a  diagonal  cut,  vertical  pola¬ 
rization. 
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Fig.  3.17 

Experimental  RCS  of  the  test  panel 
for  a  diagonal  cut,  horizontal  po¬ 
larization. 


Fig.  3.18 

Experimental  RCS  of  the  test  panel 
for  a  diagonal  cut,  vertical  pola¬ 
rization  . 


Finally,  the  RCS  has  been  computed  for  a  cut  where  the  rotation  axis  makes  an 
angle  of  cpz  =  30°  with  the  main  axis  (15°  with  the  diagonal)  of  the  plate.  In 
this  case  no  symmetry  occurs  and  the  theory  predicts  cross-polarization.  Be¬ 
cause  of  the  asymmetry  in  the  scattering  matrix,  the  cross-polarization  HV, 
see  Fig.  3.19,  is  slightly  different  from  the  cross-polarization  VH,  see  Fig. 
3.20.  No  experiments  could  be  carried  out  at  this  time  in  our  institute;  see 
however  [27] . 
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Fig.  3.19 

Theoretical  RCS  of  the  test  panel 
for  a  30°-cut,  HV-polarization. 
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Fig.  3.20 

Theoretical  RCS  of  the  test  panel 
for  a  30°-cut,  VH-polarization. 


This  section  is  closed  with  simple  formulas  derived  from  the  above  matrices 
to  estimate  the  effect  of  a  single  edge  in  relation  to  the  RCS  peak  value  of 
a  panel  computed  by  P0.  The  panel  may  be  of  arbitrary  polygonal  structure 
with  size  A.  The  edge  under  consideration  has  the  length  L,  is  hit  normally 
by  the  incident  wave  and  is  rotated  around  the  y-axis  by  the  angle  <py  (this 
is  just  the  situation  of  edge  2  in  Fig.  3.1).  Relating  the  magnitude  of  the 
elements  of  the  scattering  matrix  given  by  Eq.  (3.4)  to  the  peak  value  of  the 
PO-scattering  matrix,  one  obtains  for  horizontal  polarization 

L  |d*| 

(3.7)  rHH  =  20  log  , 


and  for  vertical  polarization 

L  |D*| 

(3.8)  rvv  =  20  log  — —  . 

The  arguments  of  d£,  dJ  are  ye  =  270°  -  <py,  (3e  =  90°. 

Using  the  values  for  d£  and  D|  given  in  Fig.  2.4  for  a  half-plane  (n  =  2)  and 
in  Fig.  2.5  for  a  step  (n  =  1.5),  the  following  table  may  be  established 
choosing  the  test  panel  of  this  section  as  an  example. 


<Py 

0° 

45° 

90° 

135° 

180° 

225° 

270° 

half¬ 

plane 

(n=2) 

rHH  [dB] 

-36.1 

-40.8 

—  oo 

-40.8 

-36.1 

-33.1 

-30.1 

rvv  [dB] 

-36.1 

-33.1 

-30.1 

-33.1 

-36.1 

-40.8 

—  oo 

step 
(n=l . 5) 

rHH  [dB] 

-32.3 

-36.8 

-44.5 

-42.8 

-44.5 

-36.8 

-32.3 

rvv  [dB] 

—  oo 

-40.3 

-34.8 

-35.4 

-34.8 

-40.3 

—  oo 

Table  3.1  Level  of  edge  diffraction  effects  related  to  the  PO-peak  value  for 
the  square  test  panel  with  edge  length  L=5 . 0 8  X. 


38 


4 ,  Application  of  the  EC-Method  for  a  Circular  Cylinder  and  a  Double  Dihedral 
4.1  Circular  Cylinder 

The  test  cylinder  has  a  length  of  5  X  and  a  diameter  of  1  X.  The  wave  length 
is  1128  mm.  The  broadside  RCS  amounts  to  20  dBsm  and  the  front  face  RCS  to 
9.9  dBsm.  The  cylinder  is  rotated  around  an  axis  vertically  to  its  own  axis. 
Theoretical  results  for  the  smooth  cylinder  based  on  PTD  are  published  toget¬ 
her  with  experimental  results  by  Ufimtsev  in  [6] .  The  circumference  of  the 
cylinder  was  modeled  (see  Fig.  4.1)  by  14  rectangular  panels  with  dimensions 
5  X  x  0.22  X.  The  deviation  from  the  true  cylinder  surface  was  about  X / 8 0 . 
Each  of  the  front  faces  was  modeled  with  14  triangles.  In  the  geometrical  mo¬ 
del  artificial  edges  (n  =  1.29)  arise  between  the  rectangular  panels.  The  na¬ 
tural  circular  edges  (n  =  1.5)  between  the  cylinder  circumference  and  the 
front  faces  are  approximated  by  straight  lines  of  length  0.22  X.  Both  types 
of  edges  are  treated  in  the  same  way  by  the  theory. 


Fig.  4.1 

Panel  model  of  the  test  cylinder. 


ASPECT  ANCLE  (deg) 


Fig.  4.2 

RCS  of  the  clinder  modeled  by 
panels,  PO-solution. 


The  PO-solution,  insensitive  to  polarization,  is  presented  in  Fig.  4.2.  The 
results  of  the  experiment  and  of  the  PTD-theory  for  horizontal  and  vertical 
polarization  are  given  in  Fig.  4.3.  The  broadside  peak  of  the  theoretical 
curves,  however,  should  not  exceed  20  dBsm.  The  pictures  of  Fig.  4.4,  finally 
present  the  results  of  the  procedure  outlined  in  this  paper.  For  this  special 
cut  they  should  be  identical  to  the  results  of  Ufimtsev.  This  is  the  case 
except  for  the  difference  at  broadside  incidence  and  some  deviations  of  mi¬ 
nor  practical  interest. 
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Fig.  4.3  RCS  of  the  smooth  cylinder,  experiment  ( - )  and  PTD-solution  ( — ) 

left  side:  HH-polarization,  right  side:  W-polarization. 


•J  o  • 


0  10  20  30  40  50  60  70  80  90 

ASPECT  ANCLE  (deg)  - - 


♦  10 


0  10  20  30  40  50  60  70  80  90 

ASPECT  ANGLE  (deg)  - - 


Fig.  4.4  RCS  of  the  cylinder  modeled  by  panels,  P0-  and  EC-solution, 
left  side:  HH-polarization,  right  side:  W-polarization. 


4 . 2  Double  Dihedral 


A  double  dihedral  constructed  on  the  basis  of  a  cube  with  additional  shado¬ 
wing  surfaces,  see  Fig.  4.5,  is  rotated  in  an  unconventional  way  as  shown  by 
Fig.  4.6.  For  a  =  0°  the  edges  of  the  double  dihedral  make  an  angle  of  45° 
with  the  axis  of  rotation.  The  purpose  was  to  generate  a  strong  depolarized 
backscattered  field.  Previous  PO-results  are  published  for  the  main  cut  in  [2] 
and  for  the  diagonal  cut  within  a  restricted  range  of  aspect  angles  in  [4]  . 


Lk' 

I 


Fig.  4.5 

Cube  with  additional  shado¬ 
wing  faces  forming  a  double 
dihedral,  dimensions  in  mm. 


Fig.  4.6  Geometry  and  axis  of  rotation. 

Figs.  4.7,  4.8  and  4.9  present  the  results  for  VH-polarization  of  experiment, 
of  PO  including  double  reflection  and  of  PO  +  EC  for  the  full  range  of  aspect 
angles  and  a  frequency  of  15.5  GHz  (X  =  19.4  mm) .  The  measurements  had  to  be 
arranged  with  great  care  since  a  wide  dynamic  range  was  needed.  In  addition, 
the  exact  positioning  of  the  double  dihedral  according  Fig.  4.6  caused  major 
problems . 

The  structure  of  the  pattern  around  0°  is  well  represented  by  PO  alone  but 
the  decrease  is  too  rapid  with  increasing  aspect  angles.  In  addition,  the 
spikes  at  -235°,  -180°,  -125°,  -55°  and  +55°  are  not  predicted  by  PO.  This, 
however,  is  the  case  when  the  EC-fieid  is  added  to  the  PO-field.  The  spikes, 
therefore,  are  due  to  edge  diffraction  only. 


ASPECT  ANCLE  (deg) 


Fig.  4.7  PO-result  for  the  double  dihedral,  VH-polarization. 
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Fig.  4.8  Experimental  result. 
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Fig.  4.9  P0-  and  EC-result . 
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5 .  Conclusion 


The  PO-method  is  frequently  applied  with  good  success  to  predict  the  RCS  of 
large  and  complicated  structures  modeled  by  a  collection  of  quadrangular  and 
triangular  panels.  However,  there  are  certain  situations  (specific  structu¬ 
res,  pattern  cuts,  polarizations)  for  which  a  correction  of  the  PO-field  by 
an  edge  diffracted  field  is  required. 

In  this  paper  the  concept  of  equivalent  currents  (EC)  is  applied.  The  fringe 
currents  of  Michaeli  are  used  to  derive  the  scattering  matrix  of  an  isolated 
edge  with  arbitrary  length  and  orientation  in  an  observer  fixed  coordinate 
system.  The  relationship  of  the  method  to  other  theories  which  are  concerned 
with  edge  diffraction  are  summarized  by  some  statements.  The  theory  is  ap¬ 
plied  for  the  analysis  of  a  square  flat  plate,  of  a  cylinder  and  of  a  double 
dihedral.  All  theoretical  results  are  compared  with  measurements. 

The  square  flat  plate  with  edge  length  5.08  X  is  analysed  in  great  detail. 

The  results  for  the  principal  plane  pattern  are  in  excellent  agreement  for 
vertical  polarization,  independent  of  whether  the  plate  was  assumed  with  zero 
thickness  or  with  0.044  X  thickness  (as  used  for  the  measurements).  For  hori¬ 
zontal  polarization  the  agreement  is  unsatisfactory.  In  the  case  of  zero 
thickness  beyond  that  identical  results  for  vertical  and  horizontal  polari¬ 
zation  are  obtained.  These  effects  are  due  to  second  order  diffraction  ef¬ 
fects  neglected  in  the  EC-theory.  This  means  that  further  effort  is  required 
if  one  is  interested  in  the  improvement  of  RCS  calculations  of  an  isolated 
plate. 

The  method  is  further  applied  for  a  cylinder  with  length  5  X  and  diameter 
1  X.  The  circumference  was  modeled  by  14  rectangular  panels,  thus  introducing 
artificial  edges.  Each  of  the  front  faces  consisted  of  14  triangular  panels, 
thus  modeling  a  circular  edge  by  short  straight  edges.  The  RCS-results  of  the 
EC-method  are  in  good  agreement  with  those  of  the  PTD-method  and  experiment, 
both  applied  for  a  smooth  cylinder. 

Finally  the  concept  is  used  for  a  double  dihedral  which  was  positioned  in 
such  a  way  that  strong  depolarizations  could  occur  during  rotation.  Also  in 
this  case  the  correction  of  the  PO-field  by  the  fringe  current  field  was  very 
efficient.  The  calculated  RCS-values  are  again  in  good  agreement  with  experi¬ 
mental  results. 

So  one  can  conclude  that  the  implementation  of  the  presented  procedure  in  a 
computer  program  would  be  efficient  enough  to  treat  edge  diffraction  effects 
with  sufficient  accuracy  under  practical  viewpoints. 
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Appendix 

The  coefficients  D|,  d£,  D|m  of  the  backscattering  matrix  are  given  by  the 
theory  of  Michaeli  [9].  For  more  details  see  also  [15]. 
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Diem  =  "  COSpe  (Hi  +  COSVe)  gi 


Dfe  =  -  U(Jl-Ve) 


+  COSVe  ' 


D?S  =  D?g  , 


(A7)  D?gm  =  0  . 


(A8)  m  =  COSVe - / 

tan2pe 

(A9)  oi  =  arccosui  , 


(A10)  U(x) 


■i\) 


x  >  0  , 
x  <  0  . 


The  coefficients  D„  ,  DP°  result  from  D,  ,  DP°  by  the  following  transforma- 
,  .  2v  2v  iv  lv 

tions : 


(All)  Ve  ->  nn-Ve  ,  Pe  K~Pe 
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Application  of  Parallel  Processing 
to  a  Surface  Patch  /  Wire  Junction  EFIE  Code 

L.  C.  Russell,  J.  W.  Rockway 
Naval  Ocean  Systems  Center 
San  Diego,  California  92152 


ABSTRACT 

A  surface  patch  /  wire  junction  EFIE  method  of  moment  algorithm,  JUNCTION,  developed 
by  the  University  of  Houston  has  been  implemented  in  a  transputer  based  parallel  processing 
environment  installed  in  a  personal  computer.  This  paper  addresses  transputer  hardware  and 
software  options,  the  JUNCTION  algorithm,  techniques  for  parallelizing  matrix  analysis  algo¬ 
rithms,  and  performance  results.  The  transputer  array  was  found  to  provide  a  flexible,  low-cost, 
high  performance  desktop  computing  environment  for  method  of  moment  analysis. 

1.0  INTRODUCTION 

This  paper  presents  the  application  of  parallel  processing  techniques  to  an  existing  compu¬ 
tational  electromagnetic  (CEM)  code.  The  goal  was  to  demonstrate  that  high  performance  com¬ 
puting  (HPC)  can  improve  the  utility  and  efficiency  of  computational  techniques  used  in 
electromagnetic  ship  analysis  [Li,  1988].  The  parallel  processing  platform  was  an  array  of  four 
transputers  on  a  board  inside  an  IBM-compatible  PC.  The  transputers  were  run  using  the  ParaSoft 
EXPRESS  operating  environment  [ParaSoft,  1990].  This  operating  environment  was  chosen  to 
allow  portability  of  the  code  to  other  HPC  platforms  and  minimize  the  number  of  required  pro¬ 
gramming  changes  to  the  CEM  code.  The  selected  CEM  code  was  the  method  of  moment  (MoM) 
algorithm  JUNCTION  developed  by  the  University  of  Houston  [Hwu  et  al,  1988;  Wilton  et  al, 
1988]. 

2.0  PARALLEL  PROCESSING 

The  concept  of  parallel  processing  has  been  around  for  a  long  time  but  only  recently  was  its 
utility  generally  recognized.  By  1986  more  than  a  dozen  companies  were  either  selling  or  in  the 
process  of  building  parallel  processors  [Tazelaar,  1988]. 

2.1  Transputer  hardware 

The  hardware  platform  chosen  was  an  array  of  four  transputers  on  a  motherboard  which  could 
be  installed  inside  a  PC.  This  decision  was  based  on  the  low  cost,  ease  of  use,  and  flexibility  of  a 
transputer  based  system.  Transputer  systems  provide  an  inexpensive  entry  into  the  world  of  parallel 
processing.  For  this  entry  level  application  a  decision  was  made  to  purchase  only  four  floating 
point  transputer  modules  (TRAMs)  each  with  one  Megabyte  of  external  RAM.  These  size  1  TRAMs 
are  an  industry  standard  and  provide  sufficient  memory  for  most  applications  at  a  reasonable  cost 
and  in  a  compact  package.  The  T800  transputer  used  has  a  peak  performance  of  1 .5  Mflops. 

2.2  Transputer  Software 

There  are  several  different  ways  to  program  an  array  of  transputers.  Each  method  has 
trade-offs  in  portability,  performance,  andease  of  use.  The  parallel  language,  Occam,  was  developed 
concurrently  with  the  transputer.  Parallel  versions  of  high  level  languages  such  as  Fortran,  C,  Pascal, 
and  Modula  2  can  be  obtained.  The  parallel  operating  environment  EXPRESS  (ParaSoft  Corpo¬ 
ration)  is  available  for  transputer  systems. 
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2.2.1  Occam 

There  are  a  number  of  advantages  to  using  Occam  for  programming  a  transputer  system. 
Since  Occam  was  developed  simultaneously  with  the  transputer,  it  is  one  of  the  few  languages 
designed  for  concurrency.  Occam  has  the  performance  and  efficiency  of  assembly  language.  Occam 
can  be  used  as  a  harness  to  link  modules  written  in  other  industry  standard  languages.  Its  other 
features  include:  well  implemented  timing  capabilities,  straightforward  control  of  process  sche¬ 
duling,  and  a  high  level  of  structure. 

The  main  disadvantage  to  using  Occam  is  that  programs  written  in  Occam  are  not  portable 
to  other  parallel  computers.  Existing  codes  written  in  other  languages  need  to  be  completely  recoded. 
Occam  works  only  with  the  transputer.  Other  disadvantages  are  that  Occam  does  not  support  many 
features  of  other  high  level  languages  (no  recursion  or  dynamic  memory  allocation),  use  of  it  requires 
learning  a  new  language,  and  Occam  may  not  be  supported  in  the  future.  In  the  trade-off  space 
(portability  versus  performance  versus  ease  of  use)  Occam  scores  high  only  in  performance. 

2.2.2  High  Level  Languages 

The  advantage  of  using  a  parallel  version  of  a  high  level  language  such  as  Fortran,  C,  Pascal, 
or  Modula  2  is  that  one  can  use  a  language  with  which  one  is  already  familiar.  All  that  is  required 
is  to  learn  the  parallel  extensions.  Existing  sequential  code  can  be  ported  over. 

There  are,  however,  a  number  of  disadvantages  to  using  a  parallel  version  of  a  high  level 
language.  Programming  an  array  of  transputers  requires  the  user  to  develop  both  a  host  program 
and  a  node  program.  The  host  program  runs  on  the  host  processor  and  handles  I/O  calls  and  manages 
the  other  processors.  The  node  program  runs  on  all  the  other  processors  and  does  the  bulk  of  the 
computations.  The  user  is  required  to  maintain  two  programs  instead  of  the  usual  one  program. 
This  can  make  program  development  and  maintenance  somewhat  complex.  Performing  I/O,  making 
system  calls,  and  measuring  timing  can  be  very  difficult.  Debuggers  for  parallel  versions  of  high 
level  languages  are  starting  to  appear,  but  they  are  not  widely  available.  Debugging  parallel  pro¬ 
grams  is  very  difficult.  In  addition  tc  all  this,  parallel  versions  of  high  level  languages  do  not  offer 
the  performance  that  one  can  get  from  Occam. 

2.2.3  ParaSoft  EXPRESS 

Parallel  operating  environments  such  as  ParaSoft’ s  EXPRESS  allow  the  user  the  convenience 
of  using  a  familiar  high  level  language  while  mitigating  some  of  the  disadvantages  to  parallel  high 
level  languages  which  were  mentioned  in  the  previous  section.  EXPRESS  is  based  on  the  CUBIX 
programming  model  which  was  developed  at  CalTech.  EXPRESS  is  available  for  both  Fortran  and 
C.  There  is  no  need  to  develop  both  a  host  and  node  program.  Instead  only  one  program  (which 
runs  on  all  processors)  needs  to  be  developed  and  maintained.  EXPRESS  does  not  include  a  compiler 
so  one  of  the  parallel  high  level  language  compilers  mentioned  above  is  still  needed  to  compile  and 
link  the  code.  During  linking  the  EXPRESS  library  is  linked  into  the  user’s  source  code.  The 
EXPRESS  library  provides  simple  Fortran  or  C  language  function  calls  to  handle  I/O  and  system 
calls.  EXPRESS  also  provides  timing  capabilities,  a  source  level  debugger,  and  a  performance 
monitor.  The  number  of  processors  being  used  does  not  have  to  be  hardwired  into  the  code.  Instead 
this  is  specified  at  runtime  by  a  switch  in  the  run  command.  In  addition,  EXPRESS  is  available  for 
a  wide  variety  of  parallel  machines,  not  just  the  transputer.  Currently  EXPRESS  is  available  for: 
the  multi-headed  IBM  3090  (AIX)  and  the  multi-headed  CRAY  (UNICOS)  mainframes;  the  Intel 
iPSC  2  and  iPSC  860;  the  nCUBE  1  and  2;  PC,  MAC,  and  SUN  hosts  for  transputers;  and  the  IBM 
RS/6000,  Silicon  Graphics,  and  SUN  workstations.  All  this  leads  to  very  portable  source  code. 
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The  only  significant  disadvantage  to  EXPRESS  is  that  it  possibly  degrades  system  per¬ 
formance  to  some  extent.  However,  this  disadvantage  is  more  than  offset  by  the  advantage  of  having 
portable  code.  For  these  reasons  EXPRESS  by  ParaSoft  was  chosen  for  the  operating  environment. 
Portability  was  the  key  feature  since  the  ultimate  goal  of  this  effort  was  to  run  the  code  on  a  high 
performance  computer.  The  language  chosen  was  Fortran  since  this  is  the  principal  language  of 
the  JUNCTION  method  of  moments  code.  For  obvious  reasons,  it  was  desirable  to  make  as  few 
changes  to  the  existing  code  as  possible. 

3.0  The  JUNCTION  METHOD  OF  MOMENTS  CODE 

The  JUNCTION  computer  code  invokes  the  method  of  moments  to  solve  a  coupled  electric 
field  integral  equation  (EFlE)  for  the  currents  induced  on  an  arbitrary  configuration  of  perfectly 
conducting  bodies  and  wires  [Wilton  et  al,  1989].  There  are  three  principal  advantages  to  the 
JUNCTION  formulation.  First,  the  EFEE  formulation  for  the  surfaces  of  bodies,  in  contrast  to  the 
magnetic  field  integral  equation  (MFEE),  applies  toopen  bodies.  Second,  JUNCTION  allows  voltage 
and  load  conditions  to  be  easily  specified  at  terminals  defined  on  the  structure.  Third,  the  triangular 
patches  of  JUNCTION  are  the  simplest  planar  surfaces  which  can  be  used  to  model  arbitrary  surfaces 
and  boundaries,  and  triangular  patches  permit  patch  densities  to  be  varied  locally  so  as  to  model  a 
rapidly  varying  current  distribution. 

The  method  of  moments  is  a  numerical  procedure  for  solving  field  integral  equations  [Har¬ 
rington,  1968].  Basis  functions  are  chosen  to  represent  the  unknown  currents.  Testing  functions 
are  chosen  to  enforce  the  integral  equation  on  the  surface  of  the  conducting  structure.  With  the 
choice  of  basis  and  testing  functions  a  matrix  approximating  the  integral  equation  is  derived.  In 
JUNCTION  a  system  of  N  linear  equations  results,  where  N  =  NB  +  Nw  +  N;. 
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where  Z  is  the  impedance  matrix,  /  is  the  current  vector,  and  E  is  the  excitation  vector.  The 
superscripts  are  B  for  body  elements,  W  for  wire  elements,  and  J  for  junctions  between  wires  and 
bodies.  The  analytical  expressions  for  each  of  the  individual  submatrices  are  different.  Solution 
of  the  linear  system  of  equations  yields  the  set  of  unknown  coefficients  used  in  the  representation 
of  the  surface,  wire,  and  junction  currents.  Once  these  currents  are  known,  the  scattered  field  or 
any  other  electromagnetic  quantity  of  interest  may  be  determined. 

4.0  IMPLEMENTATION  OF  SEQUENTIAL  PROGRAM 

The  first  step  in  parallelizing  an  existing  program  is  to  verify  the  sequential  version  of  the 
program.  This  is  done  by  running  the  program  on  a  serial  computer.  Once  this  has  been  successfully 
done,  the  next  step  is  to  get  the  program  to  run  on  a  single  processor  on  the  parallel  computer.  This 
is  necessary  to  eliminate  any  problems  caused  by  compiler  differences  between  the  serial  compiler 
and  the  parallel  compiler  and  also  to  "check  out"  the  hardware.  Only  after  this  has  been  completed 
can  the  parallelization  of  the  program  be  considered. 
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JUNCTION  was  broken  into  four  separate  programs:  DATGN,  GEOMETRY,  CURRENT, 
and  FIELD.  DATGN  creates  input  data  set  files.  GEOMETRY  calculates  the  geometrical 
parameters.  CURRENT  computes  the  currents  and  impedance.  FIELD  computes  charges,  near 
fields,  far  fields,  radar  cross  sections,  and  power  gain.  The  computationally  intensive  programs  are 
CURRENT  and  FIELD.  This  paper  is  concerned  with  the  parallelizing  of  CURRENT. 

From  a  computational  standpoint  CURRENT  consists  of  filling  the  impedance  matrix  and 
the  excitation  vector,  and  then  solving  for  the  current  vector.  The  current  vector  gives  the  currents 
on  the  bodies,  wires  and  junctions.  The  methods  used  are  standard  matrix  manipulation  techniques. 
Once  the  matrix  is  filled,  Gaussian  elimination  is  performed  using  LU  decomposition  with  partial 
pivoting  to  factor  the  matrix.  Finally,  the  current  vector  is  solved  for  by  using  backward  and  forward 
substitution.  The  library  subroutines  used  are  the  LINPACK  routines  CGEFA  and  CGESL 
[Dongarra  et  al,  1979],  CGEFA  factors  a  general  complex  matrix.  CGESL  solves  a  complex  matrix 
equation  using  the  output  from  CGEFA. 

5.0  PARALLEL  MATRIX  FACTOR  AND  SOLVE 
5.1  Parallel  LINPACK 

When  this  effort  was  initiated,  general  parallel  versions  of  the  LINPACK  subroutines  were 
not  available.  This  is  because  parallel  subroutines  are  by  their  nature  very  machine  specific  and 
one  of  the  goals  of  the  LINPACK  project  was  for  the  subroutines  to  be  machine  independent. 

On  sequential  computers  the  number  of  computations  required  to  factor  a  matrix  into  a  product 
of  triangular  matrices  is  of  the  order  (n3)  where  n  is  the  dimension  of  the  matrix.  In  comparison, 
the  number  of  computations  required  for  the  triangular  solution  are  of  the  order  ( n 2).  Because  of 
this,  unless  multiple  solutions  are  required,  most  effort  has  focused  on  optimizing  the  factorization 
algorithm. 

On  parallel  computers  good  efficiency  is  more  difficult  to  obtain  for  matrix  solving  compared 
to  matrix  factoring.  This  is  because  much  more  interprocessor  communication  is  required  during 
solving.  Solving  is  inherently  a  finer  grain  process  than  factoring.  Various  parallel  algorithms  have 
been  proposed.  Each  one  has  its  advantages  and  disadvantages  which  are  dependent  on  the  number 
of  processors  being  used,  the  size  of  the  problem  being  solved,  and  the  relative  cost  of  communication 
and  computation. 

The  matrix  columns  were  distributed  onto  the  processors  using  an  interleaving  technique 
known  as  column  wrap  mapping.  Column  wrap  mapping  has  good  load  balancing  properties  and 
gives  excellent  performance  when  doing  LU  decomposition. 

Intel  Corporation  provided  the  source  code  for  Intel’s  parallel  versions  of  DGEFA  andDGESL 
from  LINPACK.  DGEFA  and  DGESL  are  double  precision  non-complex  versions  of  CGEFA  and 
CGESL,  respectively.  Included  were  two  different  matrix  solving  routines:  a  cyclic  version  and  a 
wavefront  version  of  DGESL.  The  differences  between  these  versions  are  described  below.  These 
routines  were  all  developed  for  Intel’s  iPSC  parallel  computer  so  a  number  of  modifications  were 
needed  to  implement  them  under  EXPRESS  on  the  transputer  array.  The  matrix  factoring  routine 
was  straightforward  to  parallelize  and  provided  very  high  efficiencies. 

The  difficult  problem  was  the  solution  of  the  lower  triangular  linear  system 

Lx-b, 
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where  L  is  a  lower  triangular  matrix  of  order  n,  b  is  a  known  vector  of  dimension  n,  and  x  is  the 
unknown  solution  vector  of  dimension  n.  The  sequential  version  of  LINPACK  solves  this  by 
forward  substitution  using  the  following  doubly  nested  loop: 

for  j  =  1  to  n 

Xj  =  bjILjj 

for  i  =  j+ 1  to  n 
bi-bi-XjL^ 

There  are  a  number  of  ways  to  parallelize  this  problem  on  a  distributed  memory  parallel 
computer.  Several  algorithms  have  been  developed.  Two  of  the  most  popular  are  the  wavefront 
algorithm  and  the  cyclic  algorithm  [Heath  et  al,  1988].  These  algorithms  assume  the  matrix  is 
column  wrap  mapped  onto  the  processors. 

The  wavefront  algorithm  breaks  the  updating  of  b  into  segments  and  pipelines  the  segments 
through  the  processors  in  a  wavefront  fashion.  The  n-vector  z  is  used  to  accumulate  the  updates  of 
b  so  that  the  components  of  b  remain  distributed  among  the  processors.  The  size  of  the  circulated 
segments  is  an  adjustable  parameter  that  controls  the  granularity  of  the  algorithm  and  therefore 
affects  performance.  The  optimal  value  of  the  segment  size  depends  on  the  characteristics  of  the 
hardware. 

The  wavefront  algorithm  is  written  in  pseudo-code  as: 

for  j  e  mycols 

for  k  =  1  to  #  segments 
receive  segment 
if  k  =  1  then 

xj  -  (bj  ~  zj)!Ljj 

segment  =  segment  -  {z,} 
for  z,  €  segment 
z,  =  z,  +xJLij 

if  \segment\  >  0  then 
send  segment  to  processor  map(j  +1) 

The  cyclic  algorithm  is  similar  to  the  wavefront  algorithm  in  that  they  both  send  a  segment 
z  between  processors.  However,  the  cyclic  algorithm  circulates  a  single  segment  of  the  fixed  size 
p- 1 ,  where  p  is  the  number  of  processors.  The  name  cyclic  comes  from  the  segment  cycling  through 
all  other  processors  before  returning  to  a  given  processor.  The  updates  computed  by  the  processor 
while  the  segment  is  circulating  elsewhere  are  stored  in  the  vector  t.  The  cyclic  algorithm  is  written 
in  pseudo-code  as: 

for  j  e  mycols 
receive  segment 

Xj  —  ( bj  —  Zj  —  tj)ILjj 

segment  =  segment  -  {z,} 
for  Zj  e  segment 
z,  =  z,  + 1,  +  XjLv 

Zj  +  p  -  1  ~  tj+p-  I  +p  —  1  ,y 
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segment  =  segment  kj{zj+p_1} 
send  segment  to  processor  map(j+ 1) 
for  i‘  =  j  +  p  to  n 
ti  =  ti+xjLij 

Theoretical  analysis  shows  that  the  cyclic  algorithm  performs  best  on  a  small  number  of 
processors  whereas  the  wavefront  algorithm  performs  best  on  a  large  number  of  processors. 

5.2  Implementation  of  parallel  LINPACK  under  EXPRESS 

A  number  of  modifications  were  needed  to  the  Intel  parallel  LINPACK  routines  in  order  to 
implement  them  on  the  transputer  array  operating  under  ParaSoft  EXPRESS.  These  modifications 
fell  into  three  categories: 

1.  converting  double  precision  variables  to  complex  variables, 

2.  using  complex  LINPACK  routines  instead  of  the  equivalent  double  precision 
LINPACK  routines, 

3.  using  EXPRESS  communication  functions/routines  instead  of  iPSC  communi¬ 
cation  functions/routines. 

5.3  Results 

Performance  measurements  were  made  of  both  the  parallel  matrix  factor  routine  and  the  two 
parallel  matrix  solve  routines.  Timings  were  made  using  one,  two,  three  and  four  processors.  The 
timing  measurements  were  converted  to  the  standard  performance  measurement  parameters, 
speed-up  and  efficiency.  Speed-up  is  the  ratio  of  the  execution  time  of  the  algorithm  on  a  single 
processor  to  the  execution  time  of  the  parallel  algorithm  on  the  n  processors.  Efficiency  is  the 
speed-up  divided  by  n  and  expressed  as  percent. 

Due  to  memory  limitations,  examples  with  more  than  200  unknowns  could  not  be  run  on  a 
single  transputer.  Several  sample  problems  were  developed  to  test  out  the  parallel  factor  and  solve 
routines.  The  four  sample  problems  had  the  following  number  of  unknowns:  104,  116,  155,  and 
189. 

5.3.1  Factor 

Figure  5-1  shows  the  timing  results  on  the  parallel  version  of  CGEFA  for  the  four  sample 
problems.  Figure  5-2  shows  speed-up  and  figure  5-3  shows  efficiency.  The  important  issue  is  that 
as  the  number  of  unknowns  increases  the  parallel  matrix  factoring  routine  becomes  more  efficient. 
This  is  expected  since  small  problems  do  not  achieve  good  load  balancing  and  cannot  benefit  greatly 
from  parallel  computation.  An  efficiency  of  over  90%  is  considered  very  good. 
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Figure  5-1.  Timing  for  parallel  matrix  factoring  routine. 
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Figure  5-2.  Speedup  achieved  for  parallel  matrix  factoring  routine. 


54 


Figure  5-3.  Efficiency  achieved  for  parallel  matrix  factoring  routine. 


5.3.2  Solve 

The  performance  of  the  two  parallel  matrix  solve  routines  were  measured  for  all  the  example 
problems.  The  results  are  shown  here  for  the  largest  problem,  the  one  with  189  unknowns.  Table 
5-1  shows  the  results  for  the  cyclic  algorithm. 


Table  5-1.  Performance  results  for  cyclic  algorithm  for  189  unknowns. 


Number  of 
Processors 

Time, 

seconds 

Speedup 

Efficiency 

1 

0.621 

1.00 

100% 

2 

0.367 

1.69 

84.5% 

3 

0.257 

2.42 

80.7% 

4 

0.207 

3.00 

75.0% 

The  performance  results  for  the  wavefront  algorithm  are  complicated  by  the  presence  of  the 
adjustable  segment  size  parameter.  This  segment  size  parameter  can  range  from  1  to  the  number 
of  unknowns.  Table  5-2  shows  the  optimal  performance  segment  size  and  time.  The  optimal 
performance  segment  size  is  a  function  of  the  number  of  processors  (as  well  as  the  number  of 
unknowns).  In  addition,  by  comparing  the  timing  results  for  the  wavefront  algorithm  with  the  results 
for  the  cyclic  algorithm  it  can  be  seen  that  even  with  the  optimum  segment  size  the  performance 
for  the  wavefront  algorithm  is  much  worse  than  the  cyclic  algorithm.  For  these  two  reasons  the 
wavefront  algorithm  was  rejected  for  use  on  the  size  of  problem  which  could  be  run  on  the  four 
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processor  transputer  array.  However,  it  must  be  noted  that  the  performance  results  could  turn  out 
to  be  very  different  on  a  high  performance  computer  with  many  more  processors  running  larger 
problems. 


Table  5-2.  Performance  results  for  wavefront  algorithm  for  189  unknowns. 


Number  of 
Processors 

Optimal 
segment  size 

Time, 

seconds 

1 

95 

0.741 

2 

72 

0.555 

3 

34 

0.448 

4 

24 

0.380 

6.0  PARALLEL  MATRIX  FILLING 

Developing  a  highly  efficient  general  routine  for  parallel  matrix  filling  was  much  more  dif¬ 
ficult  in  comparison  to  the  work  required  to  parallelize  the  matrix  factor  and  solve  routines.  This 
was  due  to  several  reasons :  the  large  variety  of  problem  geometries  being  solved,  shared  calculations 
between  matrix  columns,  memory  limitations,  and  concurrent  filling  of  the  excitation  vector. 

6.1  Sequential  matrix  filling 

In  JUNCTION  the  columns  of  the  matrix  correspond  to  source  points  on  the  structure.  The 
rows  of  the  matrix  correspond  to  observation  points  on  the  structure.  The  body  columns/rows  come 
before  the  wire  columns/rows,  but  otherwise  the  ordering  of  the  columns/rows  depends  on  how  the 
problem  was  specified  in  the  input  data  portion  of  the  code.  The  junction  points  are  scattered  within 
the  wire  portion  of  the  matrix.  Wire  columns  correspond  to  nodes  on  the  wires  whereas  body 
columns  correspond  to  sides  of  triangular  patches  on  the  surfaces.  A  junction  is  a  node  on  a  wire, 
hence  its  location  is  in  the  wire  portion  of  the  matrix. 

6.2  Column  mapping  techniques 

There  are  a  wide  variety  of  techniques  which  can  be  used  to  map  the  columns  of  the  matrix 
onto  the  processors.  A  column  wrap  mapping  technique  was  used  for  matrix  factoring  and  solving 
as  described  in  the  previous  chapter. 

It  quickly  became  apparent  that  column  wrap  mapping  was  not  necessarily  the  best  method 
to  use  for  parallelizing  the  matrix  filling  portion  of  the  code.  Adjacent  columns  of  the  matrix  are 
often  associated  with  each  other.  For  bodies  adjacent  columns  are  usually  either  on  the  same  face 
or  on  an  attached  face.  For  wires  adjacent  columns  often  refer  to  nodes  which  are  joined  by  the 
same  segment.  In  JUNCTION  calculations  are  made  with  respect  to  faces  and  segments,  not  edges 
and  nodes.  This  avoids  duplication  of  work  in  a  serial  computation.  Alternative  methods  for 
mapping  the  columns  onto  the  processors  were  needed.  (Note:  mapping  the  rows  onto  the  pro¬ 
cessors,  instead  of  the  columns,  was  never  considered  since  matrix  factoring  required  the  columns 
to  be  already  mapped  onto  the  processors). 
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Alternative  mapping  techniques  were  devised  and  evaluated.  The  techniques  and  performance 
results  are  described  below  in  the  section  on  structural  dependencies.  It  was  found  that  using  a 
different  mapping  technique  to  fill  the  matrix  did  not  compromise  the  output  from  matrix  solving. 
The  only  effect  was  to  scramble  the  output  vector.  The  output  vector  is  easily  and  rapidly 
unscrambled  at  the  end  of  the  matrix  solve  routines.  Unscrambling  the  output  vector  at  the  end  is 
much  faster  than  unscrambling  the  matrix  before  the  matrix  factor  routine,  since  much  less  inter¬ 
processor  communication  is  required  for  the  former. 

6.3  Structural  dependencies 

The  performance  of  the  parallel  matrix  filling  algorithm  was  found  to  be  very  dependent  on 
the  structure  being  analyzed.  The  first  types  of  problems  analyzed  were  for  homogeneous  structures. 
Homogeneous  structures  are  defined  as  having  either  all  body  elements  or  all  wire  elements.  Next, 
heterogeneous  structures  were  evaluated.  Heterogeneous  structures  are  defined  as  having  a  mixture 
of  body,  wire,  and  junction  elements. 

6.3.1  Homogeneous  structures 

Six  types  of  structures  were  considered:  straight  wires,  cylinders,  plates,  cones,  disks,  or 
spheres.  The  performance  of  the  parallel  matrix  filling  routines  was  evaluated  for  homogeneous 
problems  made  up  of  each  of  these  types  of  structures.  Two  column  mapping  techniques  were 
compared  for  each  structure  type.  The  two  techniques  used  were  column  wrap  mapping  and  column 
block  mapping.  Column  wrap  mapping  consists  of  interleaving  the  columns  onto  the  processors. 
It  is  described  in  detail  in  Section  6.2.  Column  block  mapping  consists  of  distributing  the  columns 
onto  the  processors  using  contiguous  blocks  of  columns  rather  than  individual  columns.  As  an 
example,  if  4  processors  were  being  used  to  fill  a  matrix  with  100  columns,  using  column  block 
mapping  would  mean  that  the  first  processor  would  fill  the  first  25  columns,  the  second  processor 
would  fill  the  second  25  columns,  and  so  on. 

Four  sample  problems  were  developed  for  each  structure  type.  The  number  of  unknowns 
ranged  from  40  to  200.  Each  sample  problem  was  run  on  both  a  single  transputer  and  four  transputers. 
Measurements  were  made  of  both  matrix  filling  time  and  matrix  solving  time  for  both  column  wrap 
mapping  and  column  block  mapping. 

Table  6-3  shows  the  measured  times  and  calculated  efficiencies  for  matrix  filling  for  the 
different  structures  using  wrap  and  block  mapping. 

For  a  single  processor  it  is  seen  that  an  all  wire  homogeneous  matrix  takes  longer  to  fill  than 
an  all  body  homogeneous  matrix  of  the  same  size.  On  average  an  all  wire  matrix  took  about  38% 
longer  to  fill  than  the  same  size  all  body  matrix.  Improvements  are  being  been  made  in  JUNCTION 
to  improve  the  speed  of  calculation  for  wires.  The  cylinder,  plate,  cone,  disk  and  sphere  matrices 
all  took  about  the  same  time  to  fill  a  given  size  matrix. 

When  the  sample  problems  were  run  on  four  processors,  distinct  structural  dependencies 
emerged  for  filling  times  and  efficiencies.  It  is  important  to  note  that  column  block  mapping  always 
performed  better  than  column  wrap  mapping.  Column  block  mapping  filling  efficiencies  were 
almost  always  greater  than  70%,  whereas  column  wrap  mapping  filling  efficiencies  were  always 
between  40  and  60%.  It  is  also  interesting  to  note  that  the  different  structures  performed  differently 
in  filling  efficiencies.  For  column  wrap  mapping  wires  performed  best  followed  by  plates,  cones, 
spheres,  disks,  and,  then,  cylinders.  For  column  block  mapping  wires  again  performed  best  followed 
by  cylinders,  plates,  cones,  spheres,  and,  then,  disks.  These  results  are  due  to  the  nature  of  the 
physical  connectivity  of  the  various  structures  and  how  their  elements  are  distributed  to  the  columns 
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Table  6-3.  Matrix  filling  times/efficiencies  for  various  structures. 


Structure 

Number  of 

1  Processor 

4  Processors  ( 

Type 

Unknowns 

Wrap  Mapping 

Block  Mapping  | 

TIME, sec 

TIME 

EFFIC 

TIME 

EFFIC 

WIRE 

47 

7.38 

3.32 

55.6% 

1.97 

93.7% 

97 

30.78 

14.06 

54.7% 

8.08 

95.2% 

147 

70.51 

31.54 

55.9% 

17.95 

98.2% 

197 

126.06 

56.91 

55.4% 

32.22 

97.8% 

CYLINDER 

56 

8.56 

4.97 

KH 

2.80 

76.4% 

104 

27.04 

15.78 

K«8 

7.94 

85.1% 

152 

55.95 

32.69 

ESs 

15.68 

89.2% 

200 

95.17 

55.62 

42.8% 

25.99 

91.5% 

PLATE 

40 

5.12 

2.66 

48.1% 

1.66 

77.1% 

96 

25.11 

12.69 

49.5% 

7.88 

79.7% 

133 

45.7 

25.95 

44.0% 

14.01 

81.5% 

176 

76.87 

39.15 

49.1% 

21.92 

87.7% 

CONE 

52 

7.15 

4.08 

43.8% 

2.54 

70.4% 

100 

24.09 

13.88 

43.4% 

7.36 

81.8% 

155 

55.85 

29.12 

47.9% 

16.56 

84.3% 

200 

90.52 

46.58 

48.6% 

25.31 

89.4% 

DISK 

49 

6.60 

3.77 

2.77 

59.6% 

98 

25.19 

13.7 

60.4% 

150 

55.65 

29.84 

IF- 

8B9 

69.4% 

200 

97.01 

53.12 

45.7% 

■Bififis. 

70.6% 

SPHERE 

60 

9.06 

4.82 

47.0% 

3.23 

70.1% 

90 

19.75 

10.65 

46.4% 

6.85 

72.1% 

168 

66.56 

37.98 

43.8% 

20.27 

82.1% 

198 

91.84 

47.99 

47.8% 

28.96 

79.3% 

of  the  matrix.  Again,  in  JUNCTION  calculations  are  made  with  respect  to  faces  and  segments.  To 
avoid  duplication  of  computation  on  the  processors  it  is  advantageous  to  have  all  the  edge  com¬ 
putations  for  a  given  face  to  be  on  the  same  processor.  For  a  wire  both  nodes  of  a  given  segment 
should  be  on  the  same  processor. 

Table  6-4  shows  the  measured  times  and  calculated  efficiency  for  matrix  factoring  and  solving 
for  the  different  structures. 

In  matrix  factoring  and  solving  there  are  no  structural  dependencies.  It  should  also  be  noted 
that  as  the  number  of  unknowns  increases,  the  factoring  and  solving  efficiencies  increase.  For  200 
unknowns  factoring  and  solving  efficiency  is  greater  than  95%  percent.  Matrix  filling  efficiency 
also  increases  as  the  number  of  unknowns  increases. 
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Table  6-4.  Matrix  factor  and  solve  times/efficiency  for  various  structures. 


6.3.2  Heterogeneous  structures 

The  heterogeneous  structures  that  were  evaluated  consisted  mostly  of  square  plates  with 
numerous  attached  wires.  This  allowed  the  evaluation  of  matrices  with  various  numbers  of  body, 
wire,  and  junction  columns. 

After  evaluating  a  large  number  of  heterogeneous  problems  a  number  of  observations  were 
made.  These  observations  are  summarized  below: 

•  The  junction  columns  are  scattered  within  the  wire  portion  of  the  matrix.  The 
distribution  of  the  junction  columns  depends  on  how  the  problem  is  initially  spe¬ 
cified.  Fortunately,  it  is  fairly  straightforward  to  convert  from  junction  number  to 
matrix  column  number. 

•  A  junction  can  connect  to  between  one  and  six  faces  on  a  body.  A  quirk  of 
JUNCTION  is  that  the  code  will  only  recognize  a  maximum  of  one  junction  attached 
to  a  given  face. 
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•  To  achieve  maximum  efficiency  the  filling  of  a  junction  column  should  be  done  by 
the  same  processor  which  is  filling  the  columns  associated  with  the  faces  to  which 
the  junction  is  attached,  since  a  large  part  of  the  calculations  are  in  common.  The 
problem  is  that  the  associated  body  columns  are  sometimes  scattered  all  through 
the  matrix. 

•  The  time  to  fill  a  matrix  is  a  function  of  many  factors  including:  relative/absolute 
number  of  wire,  body  and  junction  unknowns;  number  of  separate  wires;  number 
of  segments  on  each  wire;  locations  of  wires  and  junctions  relative  to  a  body;  number 
of  faces  connected  to  each  junction. 

•  Developing  a  general  method  for  balancing  the  work  load  and  achieving  optimum 
efficiency  for  a  heterogeneous  problem  is  very  complicated,  even  for  the  simple 
problems  studied  involving  a  single  plate  with  various  attached  wires.  If  the  choice 
is  between  block  and  wrap  mapping,  block  mapping  gives  better  results.  (For  the 
problems  studied,  block  mapping  was  62  -  93%  efficient,  whereas  wrap  mapping 
was  41  -  52%  efficient). 

Based  on  the  above  observations  a  decision  was  made  to  implement  two  additional  column  mapping 
techniques.  Both  techniques  are  modifications  of  the  standard  block  mapping  technique. 

The  first  technique  is  called  random  mapping.  Random  mapping  is  block  mapping  with  one 
twist:  instead  of  the  junction  columns  staying  grouped  with  the  wire  columns,  the  junction  columns 
are  filled  by  the  processors  which  are  filling  the  body  columns  of  the  matrix.  The  columns  are 
redistributed  to  keep  a  balanced  number  of  columns  on  each  processor.  As  an  example,  suppose  a 
matrix  has  50  body  columns,  50  wire  columns,  and  6  junctions,  and  the  problem  is  being  run  on  4 
processors.  Under  standard  column  block  mapping  the  first  25  body  columns  would  be  filled  by 
the  first  processor,  the  second  25  body  columns  would  be  filled  by  the  second  processor,  the  first 
25  wire  columns  would  be  filled  by  the  third  processor,  and  the  second  25  wire  columns  would  be 
filled  by  the  fourth  processor.  The  junction  columns,  being  wire  columns  also,  would  be  distributed 
on  the  third  and  fourth  processors.  Under  random  mapping  the  six  junction  columns  would  first 
be  distributed  to  the  first  two  processors  in  the  following  manner:  junction  1  column  to  first  pro¬ 
cessor,  junction  2  column  to  second  processor,  junction  3  column  to  first  processor,  junction  4 
column  to  second  processor,  and  so  on.  The  first  and  second  processors  would  each  have  3  junction 
columns.  Block  processing  would  then  be  used  to  distribute  the  body  and  remaining  wire  columns, 
so  that  each  processor  would  finish  up  with  25  total  columns  as  before. 

The  second  technique  is  called  1st  order  mapping.  This  is  similar  to  the  random  mapping 
described  above  except  now  the  junction  columns  are  assigned  to  the  body  processors  using 
knowledge  of  which  processor  will  be  calculating  the  majority  of  columns  associated  with  the  faces 
to  which  that  junction  is  attached.  This  could  potentially  cause  more  junction  columns  to  end  up 
on  one  of  the  body  processors  than  another,  but  the  improvement  in  performance  could  be  significant. 

An  analysis  of  matrix  filling  efficiency  was  run  on  four  processors  for  the  problem  of  a  square 
plate  with  8  wires  attached  to  it.  The  problem  involved  96  body  unknowns,  96  wire  unknowns, 
and  8  junctions.  The  four  different  column  mapping  techniques  were  used.  The  results  are  shown 
in  table  6-5. 

The  results  show  that  random  mapping  only  gives  a  slight  improvement  over  block  mapping, 
but  1st  order  mapping  gives  a  significant  improvement  over  block  mapping. 
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Table  6-5.  Efficiencies  of  various  mapping  techniques. 


Mapping  Technique 

Efficiency 

WRAP 

40.5% 

BLOCK 

61.8% 

RANDOM 

63.9% 

1ST  ORDER 

76.9% 

7.0  RESULTS  AND  CONCLUSIONS 
7.1  Performance  Comparisons 

Comparisons  were  run  for  the  micro-Vax,  a  single  transputer,  the  four  transputer  array,  and 
the  Convex  Model  C-220.  The  Convex  is  a  mini-supercomputer  which  can  do  some  automatic 
vectorization. 

Sample  data  sets  were  generated  for  both  bodies  and  wires.  Eight  body  data  sets  were  generated 
with  104,  152,  200,  248,  296,  356,  404,  and  452  unknowns.  Eight  wire  data  sets  were  generated 
with  47,  97,  147,  197,  247,  297,  347,  and  397  unknowns.  For  each  data  set  and  each  hardware 
platform  timing  measurements  were  made  for  both  matrix  filling  and  matrix  factoring  and  solving. 
On  the  four  transputer  array  the  data  sets  were  run  using  both  column  wrap  mapping  and  column 
block  mapping.  On  the  Convex  measurements  were  made  of  both  CPU  time  and  actual  elapsed 
time.  At  the  time  the  measurements  were  made  there  were  1 3  other  users  on  the  Convex  and  elapsed 
time  averaged  about  3.5  times  longer  than  CPU  time.  This  should  be  kept  in  mind  when  making 
comparisons  between  the  various  hardware  platforms.  Results  are  shown  in  tables  7-1  through  7-4. 


Table  7-1.  Time  to  fill  all-body  matrix,  seconds. 


Number 

of 

Unknowns 

COMPUTER  PLATFORM  | 

Micro-Vax 

1  Xputer 

4  Xputer 

CONVEX  | 

Wrap  Map 

Block  Map 

CPU 

Elapsed 

104 

73.7 

26.6 

16.7 

8.8 

5.1 

20 

152 

151.7 

55.0 

33.4 

16.5 

10.7 

26 

200 

257.5 

93.8 

56.1 

26.6 

18.3 

69 

248 

394.0 

85.4 

39.7 

27.9 

105 

296 

555.9 

120.1 

54.9 

39.4 

145 

356 

801.6 

172.8 

78.0 

56.9 

201 

404 

1027.8 

221.4 

99.1 

73.1 

260 

452 

1282.3 

276.1 

122.9 

91.4 

329 
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Table  7-2.  Time  to  factor  &  solve  all-body  matrix,  seconds. 


Number 

of 

Unknowns 

COMPUTER  PLATFORM  | 

Micro-Vax 

1  Xputer 

4  Xputer 

CONVEX  | 

Wrap  Map 

Block  Map 

CPU 

Elapsed 

104 

15.1 

6.7 

1.8 

1.8 

0.5 

2 

152 

45.5 

20.2 

5.3 

5.3 

1.3 

4 

200 

101.8 

42.7 

11.7 

11.8 

2.8 

7 

248 

191.9 

21.9 

22.1 

5.1 

32 

296 

323.7 

36.9 

37.1 

8.5 

30 

356 

559.9 

63.6 

63.9 

14.5 

55 

404 

814.3 

92.5 

92.8 

21.0 

79 

452 

1137.2 

129.0 

129.4 

29.1 

102 

Table  7-3.  Time  to  fill  all-wire  matrix,  seconds. 


Number 

of 

Unknowns 

COMPUTER  PLATFORM  | 

Micro-Vax 

1  Xputer 

4  Xputer 

CONVEX  | 

Wrap  Map 

Block  Map 

CPU 

Elapsed 

47 

19.1 

7.3 

3.3 

2.0 

0.9 

1 

97 

78.9 

30.6 

14.0 

8.0 

3.6 

9 

147 

180.5 

70.1 

31.4 

17.9 

8.3 

23 

197 

322.4 

125.4 

56.7 

32.0 

14.9 

54 

247 

505.3 

88.0 

49.6 

23.4 

112 

297 

729.3 

127.8 

71.9 

33.8 

154 

347 

993.7 

173.1 

97.2 

46.0 

166 

397 

1302.1 

227.8 

127.8 

60.2 

172 

The  results  show  that  the  micro-Vax  fills  a  wire  matrix  32%  slower  than  it  fills  a  body  matrix 
of  the  same  size.  Using  column  wrap  mapping  the  four  transputers  fill  a  wire  matrix  8%  slower 
than  they  do  a  body  matrix.  In  contrast,  the  Convex  fills  a  wire  matrix  15%  faster  than  it  does  a 
body  matrix!  This  may  be  due  to  the  Convex’s  automatic  vectorizing  abilities.  For  comparison,  a 
single  transputer  fills  a  wire  matrix  36%  slower  than  it  fills  a  body  matrix.  The  8%  figure  for  the 
four  transputers  can  be  accounted  for  by  the  fact  that  the  wire  filling  efficiency  (55%)  is  about  28% 
greater  than  the  body  filling  efficiency  (40%)  for  these  problems  and  28+8=36.  The  conclusion 
here  is  that  for  the  JUNCTION  code,  sequentially  speaking,  wire  filling  takes  about  one  third  longer 
than  body  filling. 

Some  of  the  comparison  data  is  displayed  graphically  in  figures  7-1  through  7-3.  The  data 
is  for  an  all-body  matrix.  The  figures  show  comparisons  between  the  four  transputer  array,  the 
Convex  (CPU  time),  and  projected  results  for  a  sixteen  transputer  array.  A  four  transputer  array 
with  1  Megabyte  of  RAM  per  node  can  solve  problems  with  up  to  450  unknowns.  With  a  sixteen 
transputer  array  problems  with  more  than  900  unknowns  could  be  solved. 
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Table  7-4.  Time  to  factor  &  solve  all-wire  matrix,  seconds. 


Number 

COMPUTER  PLATFORM  j 

of 

Micro-Vax 

1  Xputer 

4  Xputer 

CONVEX  | 

Unknowns 

Wrap  Map 

Block  Map 

CPU 

Elapsed 

47 

1.6 

0.7 

0.2 

0.2 

0.1 

0 

97 

12.3 

5.3 

1.5 

1.5 

0.4 

1 

147 

41.2 

18.1 

4.8 

4.8 

1.2 

5 

197 

97.4 

42.9 

11.1 

11.1 

2.7 

10 

247 

189.9 

21.5 

21.5 

5.0 

26 

297 

327.8 

37.0 

37.0 

8.6 

35 

347 

520.2 

58.6 

58.6 

13.4 

39 

397 

776.0 

87.4 

87.4 

19.8 

57 

The  plots  show  that  a  sixteen  transputer  array  would  perform  better  than  the  Convex  for  filling 
the  matrix  and  almost  as  good  as  the  Convex  for  factoring  and  solving  the  matrix  (Note:  a  sixteen 
transputer  array,  including  the  motherboard,  would  cost  about  $15K).  However,  no  attempt  was 
made  to  improve  the  vectorization  of  JUNCTION  for  a  more  efficient  computation  on  the  Convex. 
It  is  envisioned  that  considerable  improvement  could  be  made  in  the  Convex  times  for  factoring 
and  solving  of  the  matrix. 


NUMBER  OF  BODY  UNKNOWNS 


4  TRANSPUTERS  CONVEX(CPU)  - 16  TRANSPUTERS 


Figure  7-1.  Matrix  filling  time  comparisons,  block  mapping. 
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NUMBER  OF  BODY  UNKNOWNS 


— t—  4  TRANSPUTERS  -**-  CQNVEX(CPU)  -  16  TRANSPUTERS 


Figure  7-2.  Matrix  factoring  and  solving  time  comparisons. 


NUMBER  OF  BODY  UNKNOWNS 


— 4  TRANSPUTERS  CONVEX(CPU)  - 16  TRANSPUTERS 


Figure  7-3.  Total  computation  time  comparisons,  block  mapping. 
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7.2  Observations 

In  the  course  of  this  effort  a  number  of  observations  were  made  regarding  transputers,  ParaSoft 
EXPRESS,  and  parallel  processing  in  general: 

•  A  transputer  array  is  an  inexpensive,  flexible  parallel  processing  platform.  An 
array  of  four  transputer  modules,  each  with  1  Megabyte  of  RAM,  plus  the  PC 
motherboard  cost  less  than  $5K.  The  transputer  array  is  very  flexible  since  the 
number  of  processors  can  be  easily  expanded  (just  plug  more  modules  in  the  sockets) 
and  any  Fortran  or  C  code  which  runs  on  the  PC  can  be  modified  to  run  on  the 
transputer  array. 

•  ParaSoft  EXPRESS  has  been  of  great  utility.  Without  EXPRESS  it  would  have 
been  very  difficult  to  make  the  progress  that  was  made  in  parallelizing  the  code.  In 
addition,  major  modifications  to  the  existing  code  would  have  been  required,  if  not 
a  complete  rewriting  of  the  code.  EXPRESS  will  allow  the  porting  of  the  code  to 
another  computer  platform  to  be  much  more  direct. 

•  Running  an  existing  program  on  one  transputer  is  very  straightforward  using 
EXPRESS.  By  adding  just  a  few  lines  of  code,  then  recompiling  using  the  parallel 
compiler,  and  linking  the  object  files  to  EXPRESS,  an  executable  module  which 
runs  on  a  single  transputer  can  be  created  for  virtually  any  Fortran  program  which 
runs  on  the  PC. 

•  The  effort  required  to  parallelize  a  code  is  algorithm  dependent.  Some  algorithms 
can  be  parallelized  just  by  adding  a  couple  of  lines  of  code.  Other  algorithms  require 
extensive  restructuring.  Still  other  algorithms  are  not  suited  for  implementation  on 
a  parallel  computer  and  must  be  left  as  sequential  code.  It  is  not  always  obvious 
beforehand  which  algorithms  will  be  efficient  to  parallelize  and  which  won’t. 

•  The  optimal  parallel  code  can  be  dependent  on  the  number  and  type  of  processors 
as  well  as  the  size  of  the  problem  being  solved.  This  was  shown  to  be  the  case  with 
the  matrix  solve  algorithms.  Two  algorithms  were  tested:  a  cyclic  implementation 
and  a  wavefront  implementation.  Although  the  cyclic  implementation  outper¬ 
formed  the  wavefront  implementation  for  the  4-transputer  array,  the  literature 
suggests  that  the  wavefront  implementation  will  be  the  best  performer  for  a 
massively  parallel  computing  platform  solving  larger  problems. 

8.0  THE  FUTURE 

The  state-of-the-art  technology  in  the  area  of  parallel  processing  changes  monthly  both  for 
hardware  platforms  and  software  tools.  The  transputers  used  for  this  effort  will  soon  be  outdated, 
replaced  by  faster  and  more  flexible  processing  units  [Pountain,  1990].  New  features  are  contin¬ 
uously  being  added  to  EXPRESS  to  make  it  more  powerful  and  adaptable. 

8.1  Future  Hardware 

The  big  news  in  the  world  of  transputers  is  the  development  of  the  T9000.  This  next  generation 
processor  is  being  developed  by  the  same  company,  INMOS,  which  did  the  original  pioneering 
work  on  the  development  of  the  transputer  {INMOS,  1988].  According  to  the  manufacturer  the  key 
features  of  the  T9000  are  "a  high  performance  pipelined  superscalar  processor  and  major  support 
for  multiprocessing  applications.  Peak  performance  will  be  more  than  1 50  MIPS  and  20  MFLOPS, 
representing  a  major  advance  in  parallel  computing  and  high  speed  communications. ..The  design 
goals  for  T9000  were  to  enhance  the  transputer’s  position  as  the  premier  multiprocessing  micro- 
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processor,  and  to  establish  a  new  standard  in  single  processor  performance,  while  maintaining 
compatibility  with  existing  transputer  products."  Of  course,  there  will  be  some  delay  between  the 
release  of  the  T9000  and  the  development  of  compatible  motherboards  and  software. 

8,2  Future  Software 

A  great  deal  of  effort  is  going  into  developing  programming  software  for  parallel  computing 
environments.  This  software  includes  parallel  compilers,  debuggers,  performance  analysis  tools, 
visualization  tools,  dynamic  load  balancing  tools,  and  automatic  parallelization  tools.  Most  of  the 
presently  existing  software  in  these  areas  are  very  crude. 

ParaSoft  is  in  the  process  of  improving  EXPRESS.  New  tools  which  are  being  added  include 
VTOOL,  ASPAR,  and  EXDIST.  VTOOL  will  allow  memory  access  visualization.  ASPAR  pro¬ 
vides  automatic  parallelization  of  sequential  C  programs.  EXDIST  will  provide  dynamic  load 
balancing.  ParaSoft’s  ultimate  goal  is  to  run  EXPRESS  within  a  heterogeneous  parallel  processing 
environment  -  known  as  a  "meta"  computer.  A  "meta"  computer  is  made  up  of  a  number  of 
architecturally  different  computers  which  are  networked  together  and  perform  as  a  single  entity. 
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Abstract 

Eigenfunction  expansions  for  fields  scattered  by  large  structures  are  generally  very 
slowly  convergent.  The  summation  often  consists  of  two  factors  where  one  factor 
approaches  zero  and  the  other  factor  grows  in  magnitude  without  bound  as  the  sum¬ 
mation  index  increases.  Each  term  of  the  expansion  is  bounded;  however,  due  to  the 
extreme  magnitude  of  the  individual  factors,  computational  overflow  and  underflow 
errors  can  limit  the  number  of  terms  that  can  be  computed  in  the  summation  thereby 
forcing  the  summation  to  be  terminated  before  it  has  converged.  In  this  paper  an 
exact  technique  that  circumvents  these  problems  is  presented.  An  auxiliary  function 
is  introduced  which  is  proportional  to  the  original  factor  with  its  asymptotic  behavior 
factored  out.  When  these  auxiliary  functions  are  introduced  into  the  summation,  we 
are  left  with  the  task  of  numerically  summing  products  of  well  behaved  factors.  A 
recursion  relationship  is  developed  for  computing  this  auxiliary  function. 


1  Introduction 

When  solving  for  the  fields  scattered  by  canonical  geometries,  the  exact  solution  is  often 
available  in  eigenfunction  form.  The  eigenfunction  form  is  a  viable  representation  for  the 
fields  providing  that  some  characteristic  dimension  of  the  structure  is  “small”  with  respect 
to  the  wavelength,  otherwise,  the  eigenfunction  expansion  is  very  slowly  convergent  and 
additionally  can  exhibit  the  following  computational  difficulty.  These  pathological  eigen¬ 
function  expansions  are  in  the  form  of  infinite  summations  of  products  and  quotients.  Each 
term  of  the  summation  is  well-behaved,  however,  the  magnitude  of  the  individual  factors 
and/or  divisors  become  either  too  small  or  too  large  to  handle  on  the  computer  result¬ 
ing  in  overflow/overflow  errors.  Ideally,  one  should  find  an  alternate  representation  of  the 
series  that  is  more  quickly  convergent  using  a  technique  such  as  Watson’s  transformation 
[' Tyras ,  1969].  For  many  geometries  the  topology  of  the  characteristic  plane  may  be  too 
complicated  to  perform  the  necessary  function  theoretic  manipulations.  Therefore,  one  may 
be  forced  to  sum  the  slowly  convergent  series. 
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The  overflow/ underflow  problem  can  be  alleviated  by  implementing  an  auxiliary  function 
that  is  proportional  to  the  original  pathological  function  with  the  asymptotic  behavior 
factored  out.  These  auxiliary  functions  can  be  calculated  “exactly”  from  new  recursion 
relationships  which  are  derived  from  the  recursion  relationship  of  the  original  function.  In 
Section  2,  the  development  of  these  auxiliary  functions  and  the  corresponding  recursive 
techniques  will  be  presented.  Section  3  contains  an  example  of  the  plane  wave  scattering 
by  a  circular  cylinder  using  the  techniques  developed  in  this  paper  and  Section  4  contains 
some  concluding  remarks.  Throughout  this  paper  an  eJUJt  time  dependence  is  assumed  and 
suppressed. 

2  Analytic  Formulation 

Symbolically,  an  eigenfunction  expansion  often  has  the  form 

4>  =  X!  Cn  Sn(x)  Ln(x),  (1) 

n 

where  Cn  is  a  well-behaved  constant,  Sn(x)  is  a  factor  that  becomes  increasingly  small  as  n 
grows, 

Sn(x)  - ^  0,  (2) 

and  Ln(x)  is  a  factor  that  grows  without  bound  with  increasing  n, 

Ln{x)  n- oo,  (3) 

such  that  the  product,  Sn(x')  Ln(x),  is  bounded  and  the  sum  CnSn(x)Ln(x)  is  convergent. 

2.1  Computation  by  Recursion 

Recursion  relationships  are  very  convenient,  and  are  a  common  way  to  calculate  the  Sn(x) 
and  Ln{x)  functions.  Typically,  the  functions  that  arise  in  eigenfunction  solutions  to  elec¬ 
tromagnetic  problems  satisfy  a  three  term  recursion  relationship  which  can  be  expressed  as 
follows: 


A(n,  x)  yn(r)  +  B(n ,  x)  yn+i(x)  +  C(n,x)  yn-i(x)  =  0.  (4) 

For  a  fixed  value  of  x,  the  recursion  relation  can  be  treated  as  a  difference  equation  in  n.  A 
three,  term  difference  equation  has  2  independent  solutions  [Press,  et  al.,  1988],  i.e: 

:'/»(■ r)  =  {P„(x)XUx)}.  (5) 

So  the  general  solution  to  the  recursion  relation  is: 

Vn{x)  =  a  Pn(x)  4-  ft  Qn{x)  (6) 


68 


Figure  1:  Relative  magnitude  trends  of  Pn  and  Qn- 


where  a  and  0  are  constants  that  need  to  be  determined  by  physical  considerations. 

Miller’s  algorithm  [Press,  et  al,  1988]  is  a  commonly  used  technique  to  generate  the 
sequence  {Pn(x)}  or  {Qn(x)}.  The  algorithm  begins  by  arbitrarily  choosing  two  successive 
values  to  substitute  into  the  recursion  relationship  which  is  used  to  generate  the  entire 
sequence.  For  example,  let 

y0(x)^C0{x)  and  yi(x)  =  Cx{x),  (7) 

which  when  substituted  into  Equation  (6)  yields: 

Co(x)  =  a  P0{x)  +  0Qo{x),  (8) 

Ci(x)  =  a  P\{x)  +  0  Q\(x).  (9) 

Since  Pn(x)  and  Qn(x)  are  known  to  be  independent,  then  by  Equations  (8)-(9)  the  values 
of  a  and  0  are  determined  uniquely,  hence  yn{x)  is  well-defined.  Note  that  yn(x)  contains 
components  of  both  solutions,  {Pn{x),Qn{x)},  providing  that  the  choice  of  Cq{x)  and  C\(x) 
are  not  proportional  to  either  Po(x)  and  Pi(x),  respectively  or  Qo{x)  and  Qi{x),  respectively. 


r  i  r  i _1  r  i 

Q  Po  Qo  Co  Po  Q°  ^ 

_  0  [  Pi  Qi  J  [  Ci  ’  Pi  Qi 


(10) 


At  this  point  it  is  necessary  to  examine  the  stability  of  the  desired  solution.  Stability 
refers  to  the  relative  rate  of  growth  of  the  magnitude  of  the  desired  solution  relative  to  the 
non-desired  solution.  Let’s  examine  the  common  circumstance  where  |Pn(x)|  increases  as  n 
increases  and  |Qn(x)|  decreases  as  n  increases  as  shown  in  Figure  1.  It  is  clear  that  since  our 
solution,  yn(x),  contains  components  of  both  Pn(x )  and  Qn(x ),  then  if  we  consider  larger 
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and  larger  values  of  n,  Qn (x)  will  be  less  and  less  significant  compared  to  Pn(x),  therefore: 

yn(x)  ~  a  Pn(x) ;  n  ->  oo.  (11) 

Under  these  circumstances,  Pn(x)  is  said  to  be  stable  recursing  up  (in  n).  Similarly  we 
could  have  arbitrarily  chosen  values  for  yn(x )  for  two  large  successive  values  of  n,  and  then 
recursed  down  thereby  recovering  the  solution  for  Qn(x).  In  this  case  Qn(x)  is  said  to  be 
stable  recursing  down.  Mathematically, 

yn(x)  ~  (3Qn(x) ;  n  —  -oo.  (12) 

This  process  generates  a  sequence  proportional  to  the  desired  P„(x)  and  Qn(x)  sequences. 
Pn(x)  or  Qn(x)  can  be  recovered  by  multiplying  the  entire  sequence  {aPn(a:)}  or  {pQn(x)} 
by  a  or  respectively,  a  or  (3  can  be  determined  by  calculating  one  value  of  (Pn(x)}  or 

{<2n(z)}  and  comparing  it  to  the  corresponding  value  of  {aP„(x)}  or  {/?<2»(:r)}  which  was 

calculated  by  recursion.  An  alternative  is  to  use  a  normalization  relationship  of  the  form: 

^2 7n  Pn(z)  =  1  or  ]T(5n  Qn(x)  =  1.  (13) 

n  Tl 

In  the  introduction  of  Abramowitz  and  Stegun  [1972,  p.  XIII],  there  is  a  listing  of  many 
functions  and  their  direction  of  recursive  stability,  and  in  the  various  chapters  corresponding 
to  the  functions  of  interest,  normalization  relations  of  the  form  in  Equation  (13)  can  be 
found. 

An  alternative  to  Miller’s  algorithm  can  be  applied  if  two  successive  values  of  the  solution 
are  known.  If  the  desired  solution  contains  only  one  component  of  the  two  independent 
solutions  {Pn(x),  Q„(x)},  then  it  is  necessary  to  recurse  in  the  direction  of  recursive  stability. 
Ideally,  the  direction  of  recursion  should  not  matter,  however,  due  to  round  off  error  in  the 
computer,  the  undesired  solution  will  be  present  and  eventually  grow  to  a  significant  value 
relative  to  the  desired  component  of  the  solution. 

As  mentioned  previously,  the  magnitude  of  the  individual  functions  which  we  need  to 
calculate  can  be  either  too  large  or  too  small  for  the  computer  to  handle.  This  is  why  we 
introduce  the  auxiliary  functions  in  Section  2.2. 

2.2  The  Auxiliary  Functions 

Since  Sn(x )  and  Ln{x)  are  computed  separately,  before  the  sum  converges  individually  they 
can  become  either  too  small  or  too  large  to  calculate  (due  to  computer  limitations).  To 
remove  this  upper  bound  limitation  on  the  index  of  the  summation,  n,  we  first  note  the 


asymptotic  behavior  of  Sn(x )  and  Ln(x). 

Sn(x)  a  <7n(x),  n  — ►  oo, 

(14) 

and, 

Ln(x)  oc  An(x),  n  — >  oo. 

(15) 
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Introduce  the  auxiliary  functions  Af(x)  and  A„(x)  which  are  defined  for  an  appropriate 
interval  over  n  by  the  following  expressions: 


Sn(x ) 

=  ^n(x)an(x), 

(16) 

Ln(x) 

=  A^(x)  A„(x). 

(17) 

Since  the  auxiliary  functions  are  equal  to  the  original  pathological  functions  with  the  asymp¬ 
totic  behavior  factored  out,  the  auxiliary  functions  remain  well-behaved  for  all  n  and  are 
therefore  computationally  preferable  over  the  original  Sn(x)  and  Ln(x)  functions.  These 
expressions  (Equations  (16)  and  (17))  for  Sn(x)  and  Ln(x )  can  be  substituted  into  the 
eigenfunction  expansion,  Equation  (1),  yielding: 

(j)  =  ^CnAfl{x)an{x)  A^{x)  An{x),  (18) 

n 

or, 

0  =  (is) 

n 

where, 

C'n(x)  =  Cnan{x)  An(x).  (20) 

The  expression  for  C'n(x)  can  usually  be  significantly  simplified  which  eliminates  the  ne¬ 
cessity  to  compute  the  extremely  small  and  extremely  large  values  for  an(x)  and  A„(x), 
respectively  as  n  — ♦  oo. 

The  three  factors  in  each  term  of  Equation  (19)  are  well-behaved  for  large  values  of 
n  which  makes  this  procedure  convenient  for  computing  a  slowly  convergent  eigenfunction 
expansion. 

It  is  common  practice  to  extract  a  factor  from  functions  that  grow  without  bound.  The 
main  contribution  of  this  paper  is  the  computation  procedure  for  the  auxiliary  functions 
which  is  presented  in  the  next  section. 

2.2.1  Computation  of  the  Auxiliary  Functions  by  Recursion 

The  direction  of  recursive  stability  of  an  auxiliary  function  is  the  same  as  the  function 
from  which  it  was  derived,  therefore  Miller’s  algorithm  can  also  be  applied  to  auxiliary 
functions.  We  begin  by  assuming  that  the  recursion  relationship  for  the  original  function 
(Equation  (4))  is  known.  If  we  are  trying  to  calculate  A£(x),  the  auxiliary  function  for  Ln(x), 
then  we  simply  substitute  Equation  (17)  into  the  recursion  relationship,  Equation  (4). 

An(x)  An(x)  A^{x)  +  Bn(x)  An+i(x)  A^+l(x)  +  Cn(x )  An_!(x)  A^_ x{x)  =  0  (21) 

In  this  form,  the  recursion  relationship  would  experience  the  same  overflow  and  under¬ 
flow  difficulties  as  the  original  function  (due  to  the  asymptotic  factors  An(x),  A„+i(:r)  and 
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Figure  2:  Plane  wave  incident  upon  PEC  circular  cylinder. 

An_!(x)).  It  can,  and  must,  be  analytically  simplified  at  this  stage  to  avoid  taking  differ¬ 
ences  of  very  large  numbers  which  is  a  computational  faux  pas.  Section  3  illustrates  this 
procedure  by  presenting  an  example  which  calculates  the  plane  wave  scattering  by  a  circular 
cylinder. 


3  Example:  Scattering  by  a  PEC  Circular  Cylinder 


In  this  section  we  are  presenting  an  example  that  applies  the  general  technique  outlined  in 
this  paper  to  the  specific  problem  of  determining  the  total  (incident  +  scattered)  ^-directed 
electric  field  when  a  TM2  plane  wave  is  incident  upon  a  perfect  electric  conducting  (PEC) 
circular  cylinder  which  has  its  axis  along  the  2-axis.  The  incident  held  is  given  by: 

OO 

E\  =  E0e~jkx  =  E0e-jkpcos,t>  =  E0  ^  j'V„(i/))ej¥,  (22) 


where  Eq  is  the  complex  amplitude  and  the  coordinates  x,  p  and  0  are  shown  in  Figure  2. 
The  total  2-directed  field  is  given  by  [ Harrington ,  1961]: 


E,  =  Bo  £  r" 


Mkp)  - 


Hn\ka) 


oi™t> 


(23) 


It  is  possible  to  convert  this  series  into  a  more  quickly  converging  representation,  however, 
this  will  not  be  done  since  the  goal  of  this  section  is  to  illustrate  the  recursive  technique 
outlined  in  this  paper  by  a  simple  example.  An  additional  numerical  difficulty  which  will 
not  be  addressed  in  detail  here,  with  the  form  of  Equation  (23)  is  that  when  computing 
the  total  fields  near  the  cylinder,  p  a,  there  can  be  a  loss  of  significant  digits.  This 
may  be  overcome  by  computing  the  cross  product  directly  by  means  of  recursion  relations 
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[Abramowitz  and  Stegun ,  1972,  p.  361].  Note  that  this  problem  does  not  occur  when  com¬ 
puting  the  scattered  fields  alone. 

Using  the  relationship  that  Zn,  any  integer  order  Bessel  function  (Jn,Yn)  ,  H^), 
satisfies: 


Z-n  =  (-1  )nZn, 
we  can  express  the  total  fields  as: 


E,  = 

Jtin  \fCCLj 


n= 0 


cos  (n<p), 


(24) 


(25) 


where, 


— 


1  ;  n  =  0, 

2  ;  n  ^  0. 


(26) 


Notice  that  the  second  term  in  the  brackets  exhibits  the  behavior  that  is  discussed  in  this 
paper,  namely  that  each  individual  factor  grows  without  bound  (H^),  or  approaches  zero 
(Jn),  as  n  increases.  The  asymptotic  behavior  of  the  Bessel  functions  of  the  first  and  second 
kind  are  given  by  [Abramowitz  and  Stegun ,  1972,  p.  365]: 


Ux)  ~  -miW' = a'{x)'  n~°°' 


and, 


n  — >  oo, 


™  -  -£sp 

and  since, 

H?\x)  =  Mx)-jYn(x). 

Then, 

~  ^  (I)  ” = A“w'  " 

So  then  the  auxiliary  functions  are  defined  by: 

J"(i)  =  n=i’2-3'-'°°' 


oo. 


and, 


=  j\ n=  1,2,3,.  ..,oo. 


(27) 


(28) 


(29) 


(30) 


(31) 


(32) 
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All  Bessel  functions  satisfy  the  same  recursion  relationship  [Abramowitz  and  Stegun ,  1972, 
p.  365]: 


Zn-i(x)  +  Zn+i{x)  =  —Zn{x).  (33) 

Jb 

By  comparing  the  asymptotic  forms  of  Jjx)  and  Yn(x),  Jn(x)  is  downward  stable  and  Yn{x) 
is  upward  stable  [ Abramowitz  and  Stegun,  1972,  p.  XIII].  Since  the  Hankel  function  consists 
of  both  Jn(x)  and  Yn(x)  and  they  differ  in  their  direction  of  recursive  stability,  it  is  necessary 
to  decompose  the  Hankel  function  to  determine  Jn{x)  and  Yn(x)  separately. 

The  definitions  for  the  auxiliary  functions  (Equations  (31)  and  (32))  are  indeterminate 
for  n  —  0.  For  this  reason,  we  will  extract  the  n  =  0  term  of  Equation  (25)  and  begin  the 
sum  from  n  —  1. 


Ez  =  E0 


Joikp)  - 


AJn(kp) 


'eka2\n  AJn(ka)  H 
. 2np  )  WM)  "i  P) 


cos  (ncp).  (34) 


Calculation  of  A„(x) 

The  recursion  relation  which  defines  A„(x)  is  found  by  substituting  Equation  (31)  into 
Equation  (33).  After  algebraic  simplification,  the  following  form  of  the  recursion  relation  is 
obtained: 


AJn{x)  = 

nV  ’  \n  +  \) 


n+l/2 


(|ex)2 

(»  +  2)s 


(n 


n 


+  2 


n+l/2 


KAX) 


(35) 


where  e  is  the  base  of  the  natural  logarithm.  This  is  in  a  form  suitable  for  downward 
recursion. 

We  will  apply  Miller's  algorithm  to  determine  a  sequence  denoted  by  A^(x)  which  is 
proportional  to  the  desired  A„(x)  sequence.  We  choose  N  to  be  larger  than  the  maximum 
number  of  terms  expected  to  be  summed  by  M.  Also,  let: 


(36) 


and, 


AinM  =  0.  <37) 

Use  the  recursion  relationship  (Equation  (35))  to  determine  (A^(x)}  for  n  =  1, 2, 3, . . .  N  —  1. 
(A^(x)}  is  a  sequence  proportional  to  the  desired  sequence  (A^(x)}.  There  are  many  ways 
to  normalize  the  sequence  (A^(x)}.  Here  we  will  use  Equation  (31)  to  determine  Ax(x) 


which  will  then  be  compared  to  A/(x)  to  determine  the  constant 

ex  A{{x) 

P  2v/2ir  Ji(x)  ' 


of  proportionality,  0. 


(38) 
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So  then, 

(4I(*)}  =  n  =  1,2,3, ...  N  —  M.  (39) 

The  following  table  was  constructed,  using  these  procedures,  to  emphasize  the  favorable 
behavior  of  the  magnitude  of  the  auxiliary  functions  compared  to  the  magnitude  of  the 
Bessel  functions.  We  are  showing  two  arguments  of  the  auxiliary  function  for  a  wide  range 
of  orders.  This  table  illustrates  the  difficulty  in  computing  the  Bessel  functions  directly. 
The  triple  asterisk  indicates  that  this  term  is  larger  than  10308,  the  largest  number  our 
computer  can  handle. 


n 

Jn(  1) 

Jn(5) 

i 

4.4005  x  10" 1 

0.81157 

-3.2758  x  10-1 

-0.12083 

2 

1.1490  x  10"1 

0.88200 

4.6565  x  10-2 

0.014297 

5 

2.4976  x  10"4 

0.94323 

2.6114  x  lO-1 

0.31559 

20 

3.8735  x  10-25 

0.98405 

2.7703  x  10-11 

0.73798 

50 

2.9060  x  10-78 

0.99345 

2.2942  x  10"45 

0.88306 

100 

8.4318  x  10" 189 

0.99670 

6.2678  x  lO' 119 

0.93919 

200 

+  *  * 

0.99834 

4.7600  x  lO"296 

0.96898 

500 

*  *  * 

0.99927 

♦  *  * 

0.98737 

1000 

♦  *  * 

0.99965 

*  *  * 

0.99367 

Calculation  of  A„m(x) 

As  mentioned  previously,  the  Hankel  function,  H^(x),  consists  of  Bessel  functions  of 
the  first  and  second  kind,  Jn(x)  and  Yn(x),  respectively.  J„(x)  and  Yn(x)  differ  in  their 
direction  of  recursive  stability,  therefore,  they  must  be  computed  separately.  We  note  that 
as  a  consequence  of  Equations  (29)  and  (32): 


Mx)  = 

(40) 

v-w  -  >■ 

(41) 

From  these  relationships,  we  determine  that  Im{d^(2)(i)}  is  stable  recursing  down  and 
that  1le{A“m  (x)}  is  stable  recursing  up.  Substituting  Equation  (40)  into  Equation  (33) 
yields  the  following  recursion  relation  in  a  suitable  form  for  downward  recursion. 

lm{A“{2\x)}  =  2lm{A%+l{x)} 
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We  can  use  this  recursion  relation  along  with  Equation  (40)  with  n  =  1  (for  normalization), 
to  apply  Miller’s  algorithm. 

lm{Af2\x)}  =  Jx(x)  (43) 

The  calculation  of  7 Ze{A%m (x)}  requires  two  starting  values  and  the  recursion  relation¬ 
ship  since  it  is  upward  stable.  The  two  starting  values  are  obtained  by  substituting  n  =  1 
and  n  =  2  into  Equation  (41). 


7le{A?m{x)}  =  “\/fy  Yl(x),  (44) 

and, 

ne{A^m(x)}  = -y/i Y^x)'  (45) 

The  recursion  relation  is  obtained  by  substituting  Equation  (41)  into  Equation  (33) 
yielding  the  following  recursion  relationship  for  7le{A„(2)  (rr)}  in  a  suitable  form  for  upward 
recursion. 


fax) 

2 

L  , 

fn  —  2\ 

(n  -  2)2  1 

i  n  J 

(46) 


The  following  table  is  presented  for  a  comparison  of  the  Hankel  function  of  the  second 
kind  with  its  auxiliary  function.  We  are  showing  two  arguments  of  the  auxiliary  function 
for  a  wide  range  of  orders. 


n 

1 

2 

5 

20 

50 

100 

200 

500 


H?\l) 

(4.4005  +  >7.8121)  x  10_1 
(0.11490  +  jl.6507)  x  10° 
(0  0000  +>2.6041)  x  102 
(0.0000  +  >4.1140)  x  1022 
(0.0000  +>2.1911)  x  1077 
(0.0000  +  >3.7753)  x  10185 
*  *  * 

*  *  * 


(1) 


Hl2)(  5) 


1.3307  -  >0.74960 


(-3.2758  ->1.4786)  x  10”1 


1.3511  ->0.0940 


(0.46565  -  >3.6766)  x  10“ 


1.0831  +>0.0000 


(2.6114  +  >4.5369)  x  10_1 


1.0175  +>0.0000 


(0.0000  +  >5.9340)  x  108 


1.0068  +>0.0000 


(0.0000  +  >2.7888)  x  104 


1.0034  +>0.0000 


(0.0000  +  >5.0849)  x  10115 


1.0017  +>0.0000 


(0.0000  +>3.3446)  x  10292 


1.0007  +>0.0000 


1.0003  +  >0.0000 


AH™!,  5) 

-1.2594  +>2.7900 
-7.5237  ->0.9529 
5.8970  -  >3.3942 
1.3996  +>0.0000 
1.1381  +  >0.0000 
1.0661  +>0.0000 
1.0323  +>0.0000 
1.0128  +>0.0000 
1.0064  +>0.0000 
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4  Conclusion 


In  this  paper  we  introduce  a  technique  which  circumvents  the  need  to  compute  very  large  or 
very  small  special  functions  that  commonly  appear  in  eigenfunction  expansions.  This  was 
accomplished  by  introducing  a  set  of  well-behaved  auxiliary  functions  which  are  equal  to  the 
original  special  function  with  its  asymptotic  behavior  factored  out.  When  the  eigenfunction 
expansion  is  expressed  in  terms  of  these  auxiliary  functions,  the  summation  can  be  simplified 
resulting  in  well-behaved  factors  in  the  sum.  The  auxiliary  functions  can  be  computed  via 
modified  recursion  formulas. 

This  procedure  is  formally  exact  since  we  are  not  making  any  approximations  -  only 
substitutions.  Typically,  when  calculating  summations  of  the  type  addressed  here  without 
the  use  of  the  auxiliary  functions,  at  some  point  in  the  summation  (which  needs  to  be 
determined),  an  asymptotic  form  is  substituted  into  the  expression.  The  procedure  described 
herein  avoids  the  need  to  switch  functional  representations  thereby  eliminating  the  need  to 
determine  the  value  of  the  index  to  implement  the  asymptotic  representation. 

In  this  paper  we  have  restricted  our  development  of  the  auxiliary  functions  to  the  extrac¬ 
tion  of  the  asymptotic  form  of  the  function.  The  procedure  is  not  limited  to  this  asymptotic 
extraction.  Any  functional  form  that  is  convenient  for  the  formulation  of  the  problem  at 
hand  can  be  extracted  in  a  similar  manner. 
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’H-ORIENTED’  AND  ’B-ORIENTED’  METHODS 
IN  A  PROBLEM  OF  NONLINEAR  MAGNETOSTATICS: 
SOME  METHODOLOGICAL  REMARKS 


A.  Bossavit 

Electricity  de  France,  1  Av.  du  G£n6ral  de  Gaulle, 
92141  Clam  art 


ABSTRACT 

Problems  which  depend  on  a  small  parameter  in  their  formulation  can  often  be  studied  by  a  perturbation 
approach.  Whether  the  perturbation  is  "regular"  or  "singular"  is  important  in  many  respects.  In  magneto¬ 
statics,  due  to  some  inherent  duality,  both  kinds  of  perturbation  may  happen,  depending  on  the  chosen 
formulation  ("b-oriented”  vs.  "h-oriented”  methods).  Singularly  perturbed  problems  are  numerically  more 
difficult  than  regularly  perturbed  ones.  We  suggest  that  this  might  explain  why,  as  some  re<  nt  numerical 
observations  seem  to  suggest,  b-oriented  methods  should  give  better  accuracy  in  a  specific  ss  of 
nonlinear  magnetostatic  problems  at  high  permeability. 


INTRODUCTION 

This  paper  is  a  case  study,  which  butts  on  a  methodological  question  of  such  generality 
that  it  seems  almost  preposterous  to  address  it  in  written  form:  how  should  the  particular¬ 
ities  of  a  physical  problem  (here,  the  presence  of  "small  parameters")  guide  the  modeler  in 
the  selection  of  a  numerical  method?  This  is  a  basic  subject,  one  about  which  everybody 
has  definite  opinions  (and  even,  sometimes,  strong  feelings),  but  also  an  elusive  one, 
difficult  to  treat  in  a  comprehensive  way.  (This  is  why  there  are  relatively  so  few  books  or 
paper  collections  devoted  to  mathematical  modelling,  like  e.g.,  [  1,  2,  3,  6,  7,  9,  11,  13, 15, 
22].  Why  most  of  them  seem  so  remote  from  the  kind  of  mathematical  modelling  we  do  is 
a  puzzling  question.) 

One  might  say  that,  after  all,  this  is  quite  normal.  Why  should  "tricks  of  the  trade"  be 
honored  with  formal  dissertations?  But  a  moment  of  reflection  will  show  that  some  of  the 
most  powerful  of  such  tricks  deserve  to  be  thus  treated,  that  they  are,  and  that  it  is 
eventually  beneficial  for  all  those  concerned.  Take  Fourier  analysis,  for  instance.  On  one 
level,  it  is  an  efficient  dimensionality-reduction  device,  which  helps  replace  the  numerical 
solution  of  a  fully  three-dimensional  problem  with  a  series  of  simpler  ones,  in  one  or  two 
dimensions.  On  another  level,  it  is  the  practical  and  usable  by-product  of  a  majestic 
mathematical  theory:  harmonic  analysis  [  12,  17].  Is  this  a  coincidence?  Probably  not. 

The  development  of  the  theory  was  stirred  by  the  efficiency  of  what  was  at  the  time  a 
modelling  trick,  the  use  of  trigonometric  series  in  the  study  of  heat  transfer,  and  it  has 
payed  back  largely,  as  far  as  mathematical  modelling  is  concerned.  This  is  the  classical 
example  of  a  practical  modelling  tool — a  trade  trick — backed  by  a  strong  mathematical 
theory,  and  of  the  dialectics  of  their  historical  development.  Other  examples  will  come  to 
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mind  (the  "least  squares"  trick  in  relation  with  control  and  identification  theory,  some 
aspects  of  Geometrical  Diffraction  Theory,  some  statistical  tests  . .  .). 

The  small  parameter  issue  (an  instance  of  which  will  be  presented  in  a  moment)  might 
deserve  such  a  status.  Here  there  is,  no  doubt,  a  trick:  use  dimensional  analysis  to  expose 
"small  parameters",  identify  the  corresponding  terms  in  the  equations,  drop  them.  There 
is  also  a  grandiose  theory,  or  rather,  theories:  asymptotic  analysis,  perturbation  theory. 

But  the  theory  appears  incomplete,  it  lacks  unity,  and  the  trick  itself  is  far  from  being 
fail-safe:  when  your  equation  is  -  e  u"  +  u  =  f,  or  -  u"  +  e  u  =  f,  should  you  drop  the 
e-term,  or  not?  (Think  of  e  as  related,  for  instance,  with  the  skin-depth.)  The  answer  to 
this  particular  question  is  known,  but  lots  of  similar  ones  are  still  to  be  addressed.  Here  is  a 
rich  field  of  study,  full  of  prospects  for  both  intellectually  stimulating  mathematical 
theories  and  usable  practical  recipes. 

Yet,  case  studies  may  prove  more  effective  than  comprehensive  studies  of  large 
scope,  at  this  stage.  One  such  study  comes  from  the  experience  of  a  small 
electromagnetism  community  (the  "TEAM  workshop",  [22,  23])  whose  emphasis  is  on 
low-frequency  and  static  situations.  The  test-case,  described  below  in  general  terms,  is  a 
non-linear  magnetostatics  problem  on  which  the  same  observation  has  been  made  by 
several  independent  investigators:  numerical  methods  based  on  the  use  of  the  vector 
potential  tend  to  perform  better  than  those  using  a  scalar  potential.  This  calls  for  an 
explanation,  which  is  the  theme  of  the  present  paper.  In  short,  there  are  two  "small 
parameters"  in  this  problem,  the  ratio  of  the  airgap  to  some  characteristic  length,  and  the 
ratio  of  permeabilities  in  the  air  and  in  the  iron.  Their  interplay  results  in  the  possibility 
of  formulating  the  problem  as  a  perturbation  with  respect  to  some  similar,  much  simpler 
problem.  This  is  a  standard  idea.  What  is  new  here  is  that,  due  to  some  symmetry  inherent 
in  Maxwell  equations,  there  are  two  ways  to  do  this,  based  respectively  on  the  use  of  the 
scalar  and  of  the  vector  potential,  and  that  the  nature  of  the  perturbation  is  not  the  same  for 
both:  it's  "regular"  perturbation  in  the  former  case,  "singular”  perturbation  in  the  latter. 
An  attempt  is  made  to  link  these  facts  with  the  numerical  observations. 

Similar  situations  where  two  small  parameters  are  "competing",  each  one  driving  the 
problem  toward  different  limits,  seem  to  be  common  to  other  areas  of  electromagnetics, 
for  example,  skin  effect,  current  flow  around  a  crack,  and  even  (for  higher  frequencies) 
surface  impedance. 


’H-ORIENTED’  AND  B -ORIENTED'  METHODS  IN  MAGNETOSTATICS 

We  shall  consider  the  following  general  situation  (Fig.  1):  a  coil  C  (sustaining  a 
given,  permanent,  current-density  j8),  and  a  ferromagnetic  piece  M.  The  whole  space  is 
E,  and  A  =  E  -  C  -  M  is  the  air-region.  A  non-linear  behavior  law  b  =  T(h)  is  given  (no 
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hysteresis).  For  the  present  discussion,  we  assume  a  T  such  that  vectors  b(x)  and  h(x) 
are  collinear  at  all  points  x  in  M,  and  that  b(x)  vanishes  if  h(x)  does.  One  is  requested 
to  compute  the  fields  b  and  h.  The  relevant  equations  are 

(1)  curl  h  =  js  inE,  b=r(h),  divb  =  0  in  E. 


Figure  1.  A  typical  problem  (left).  Definition  of  surface  E  and  domain  M,  to  be 
used  in  the  sequel  (right). 


We  are  only  interested  here  in  comparing  the  accuracy  achievable  with  various 
methods,  not  in  things  like  the  computing  time,  the  number  of  steps  in  iterative 
procedures,  etc.  So  we  may  consider  a  linear  problem  with  the  same  solution  as  ( 1), 
namely 

(2)  curl  h  =  j8  in  E,  b  =  p  h,  div  b  =  0  in  E, 

where  p  is  the  function  x  — >  lb(x)l/lh(x)l,  as  computed  from  the  actual  solution  { b,  h } . 
Whatever  can  be  said  about  the  merits  of  various  methods  as  regards  (2)  will  thus  be 
relevant  to  (1). 

The  ratio  p(x)/Po  is  a  feature  of  the  solution  (and  as  such,  it  depends  on  the  imposed 
current  js).  It  is  usually  high  (about  2000,  typically),  but  tends  to  decrease  when  the 
driving  current  j8  is  increased  ( saturation  phenomenon).  Let  us  call  e  the  average  of 
Po/p(x)  over  M.  We  define,  for  all  x  in  M,  p,(x)  =  e  p(x).  Note  that  p,  is  close  to  Po, 
except  perhaps  in  saturated  zones.  This  £  is  one  of  the  small  parameters  alluded  to  in  the 
Introduction.  It  will  play  an  important  role  in  what  follows.  The  other  small  parameter  is 
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the  ratio  d/L  of  the  gap-width  d  to  the  average  length  L  of  the  magnetic  circuit  M.  We 
shall  assume  that  d/L,  though  small,  is  still  large  with  respect  to  £  thus,  the  reluctance  of 
the  whole  device  is  mainly  due  to  the  air  gap.  This  is  the  assumption  used  in  the  TEAM 
workshop  study,  and  is  the  rule  in  problems  of  this  kind.  (But  the  assumption  is  a  crucial 
one:  if  the  reluctance  was  mainly  due  to  the  iron  core,  this  would  modify  our  conclusions.) 

Although  the  discussion  will  be  limited  to  linear  problems,  we  are  not  losing  anything 
significant  in  generality,  because  one  can  always  tackle  ( 1 )  by  iterating  on  |i,  Newton- 
Raphson  style,  a  linear  system  being  solved  at  each  step.  In  fact,  almost  all  methods  in 
actual  use  in  nonlinear  magnetostatics  seem  to  rely  on  this  approach.  They  fall  into  two 
main  families:  ’b-oriented'  methods,  which  yield  b  (for  instance  by  computing  the 
magnetic  vector  potential),  and  'h-oriented'  methods,  which  aim  at  h  (for  instance  by 
computing  the  magnetic  scalar  potential).  For  the  sake  of  definiteness,  let  us  formalize 
this: 

Definition  1.  A  vector  field  h  [resp.  b]  is  said  "curl-conformal"  [resp.  "div- 
conformal "J  if  its  tangential  [resp.  normal]  part  is  continuous  across  all  surfaces  in  its 
domain  of  definition. 

Definition  2.  A  method  will  qualify  as  "h-oriented"  [resp.  as  "b-oriented"]  if  it 
computes  h  [resp.  b ]  in  such  a  way  that  its  tangential  [resp.  normal]  continuity  is  exactly 
enforced,  i.e.,  if  it  yields  a  curl-conformal  h  [resp.  a  div-conformal  b]. 

Let  us  develop  an  example  of  b-oriented  method.  A  tetrahedral  mesh  is  first 
designed,  large  enough  to  cover  M,  C,  and  enough  of  the  air-region  to  justify  neglecting 
what  happens  outside  the  meshed  volume.  Let  be  the  set  of  nodes.  Let  Xn  be  the 
"hat-function"  associated  with  node  n  (i.e.,  the  unique  continuous,  piecewise  affine, 
function  equal  to  1  at  node  n  and  to  0  at  all  other  nodes).  Call  IP1  the  set  of  all  vector 
fields  of  the  form 

(3)  a=  Zn  6  ^an  A.n, 

where  the  degrees  of  freedom  (DoF)  an  are  vectors,  one  per  node.  Now  look  for  a  in 
IP1  such  that 

(4)  JE|i_l  curl  a  .  curl  a'  =  jE  je  .a'  V  a'  e  IP1 

(the  so-called  "weak  form"  of  the  equation  curl((i_l  curl  a)  =  jp).  This  is  a  linear  system  in 
terms  of  the  3x  N  components  of  the  ans  (N,  the  number  of  nodes  in  the  mesh).  There 
exists  a  solution  (because  div  js  =0),  perhaps  not  unique,  but  anyway,  all  solutions  will 
have  the  same  curl.  (Numerical  difficulties  that  one  may  encounter  in  solving  (4)  are  not 
our  concern.)  This  is  a  b-oriented  method,  because  it  gives  b  =  curl  u,  thus  enforcing  the 
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essential  requirement  about  the  continuity  of  the  normal  component  of  b  across  all 
surfaces,  including  the  faces  of  the  mesh.  The  field  h  =  p~'  curl  a  fails  to  have  the 
symmetric  property  of  continuity  of  its  tangential  part  through  surfaces.  This  is  so 
because  the  relation  curl  h  =  jg  is  only  weakly  enforced  by  (4). 

On  the  other  hand,  the  well-known  "single  magnetic  potential"  method  is  h-oriented. 
Its  principle  is  to  look  for  the  field  h  in  the  form  h  =  grad  (p  +  hs,  where  h8  is  a  field  that 
satisfies  roth8=j8  and  divh8  =  0.  (Such  a  field  can  be  constructed  from  j8  byBiot- 
Savart  integration.)  The  unknown  in  this  problem  is  thus  the  potential  q>.  According  to 
the  general  finite  element  approach,  one  thus  has  to  find  a  function  <p  of  the  form 

(5)  <p  =  <fcA 

such  that 

(6)  jE  p  (grad  <p  +  hg) .  grad  <p"  =  0 

for  all  test-functions  tp"  themselves  of  the  form  (5).  (This  is  a  way  to  enforce  the 
condition  div(ph)  =  0,  in  weak  form.)  Here  the  <pns  are  scalar  degrees  of  freedom.  This 
time  h  has  the  required  tangential  continuity,  but  b  =  p  h  fails  to  have  normal  continuity, 
because  div  b  =  0  is  only  weakly  enforced:  the  method  is  "h-oriented". 

Remark.  Note  that  (p  is  single-valued  here.  There  are  variants  in  which  a  similar,  but 
possibly  multivalued  magnetic  potential  is  used  [19],  or  more  than  one  potential  [18]. 
Clearly,  such  methods  also  are  h-oriented. 

Not  all  methods  fall  in  one  of  the  previous  categories.  A  method  can  have  both 
orientations.  But  in  that  case  the  behaviour  law  b  =  p  h  will  fail  to  be  exactly  satisfied,  in 
general.  (This  point  is  developed  in  [5].) 

Now  the  scene  is  set.  We  shall  argue  as  follows:  "b-oriented"  methods  should  in 
general  work  better,  because  the  smallness  of  the  ratio  P(/p  results  in  a  regularly 
perturbed  problem  with  such  formulations,  whereas  "h-oriented"  methods  suffer  from 
singular  behavior  with  respect  to  this  same  parameter.  Therefore,  they  are  more  sensitive 
to  numerical  error. 


SINGULAR  VS.  REGULAR  PERTURBATIONS 

Let  us  first  recall  the  difference  between  regular  and  singular  perturbation,  with  the 
help  of  a  simple  example.  Suppose  we  have  to  solve  the  two-points  boundary -value 
problem 
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(7) 


-  u"  +  e  k(x)  u  =  f ,  u'(0)  =  0,  u(l)  =  0, 


for  a  numerically  small  value  of  £,  and  a  given  positive  function  k.  Note  that  this  is 
equivalent  to  minimizing  the  functional  Fe(v)  =  i,(lv'l2  +  £k  Ivl2- 2  f  v)  among  all 
those  in  the  set  V  =  { v  €  C°[0, 1] :  J[0  ,j  Iv'l2  <  °o,  v(l)  =  0} ,  which  can  easily  be  done  by 
numerical  methods.  But  instead,  in  order  to  take  advantage  of  e’s  smallness,  one  would 
rather  make  use  of  the  "limit-problem",  corresponding  to  £  =  0,  whose  solution  u0  is 
obtained  directly,  by  quadratures,  without  having  to  solve  a  linear  system.  The  idea  is  to 
expand  u  as  u0  +  £  u,  +  ...,  and  to  throw  this  into  (7):  again,  u ,  can  be  obtained  by 
straightforward  integration,  the  same  way  u0  was.  And  so  on.  Problem  (7)  is  a 
perturbation  of  a  simpler  one  (namely,  to  find  u  such  that  -  u"  =  f,  u'(0)  =  0,  u(l)  =  0), 
and  therefore  can  be  solved  by  a  cascade  of  similar  problems.  Note  that  u0  still  achieves 
the  minimum  of  some  functional  (namely,  F0(v))  over  the  same  set  V.  This  is  what  makes 
the  perturbation  "regular":  the  limit  solution  is  solution  to  the  limit  problem,  as  obtained 
by  setting  £  =  0  in  either  (7)  or  its  equivalent  variational  formulation,  "minimize  Fe(v) 
over  V".  Note  also  that  ue  and  its  first  derivative  both  converge  (in  the  sense  of  quadratic 
means)  to  u0  and  u'0  respectively. 

In  contrast,  "singular  perturbation"  occurs  with 
(8)  -  £  u"  +  k(x)  u  =  f,  u'(0)  =  0,  u(l)  =  0. 

If  one  wants  to  solve  this  numerically,  the  smallness  of  £  imposes  a  small  discretization 
step,  which  is  cumbersome,  so  making  use  of  the  limit  problem  seems  again  a  good  idea. 
But  now  two  things  go  wrong.  First,  the  would-be  "limit-problem",  i.e.,  (7)  with  £  =  0,  or 
its  variational  version,  "minimize  J(0  ,j  (Ik  Ivl2-  2  f  v)  over  V",  has  no  solution.  Next, 
when  one  relaxes  its  requirements  by  dropping  the  boundary  conditions,  the  solution  u0 
of  the  relaxed  problem,  "minimize  over  L2([0, 1])",  which  is  Uo  =  f,  is  not  the  limit  of  ue 
in  the  above  sense:  only  uc  converges  towards  u0  in  L2,  whereas  u'E  does  not  converge  to 
u'o  (Fig.  2). 

Singular  perturbation  is  thus  characterized  by  the  fact  that,  when  passing  to  the  limit, 
the  solution  "escapes"  from  the  set  where  it  naturally  lives,  and  while  it  still  does  converge 
to  something,  it  is  in  a  weaker  sense,  and  to  an  object  which  belongs  to  a  larger  universe.  It 
does  not  mean  that  the  (non-acceptable)  "limit  solution"  u0  is  useless  as  a  stepping  stone  to 
obtaining  the  true  solution,  u£,  of  (8).  Indeed,  u0  is  a  term  in  some  asymptotic  expansion 
of  uE,  that  can  be  derived  via  a  moderately  involved  process  called  "matching  asymptotic 
expansions".  (Some  references  to  these  things  are  [  10,  14, 16,  20,  21].)  Quite  often,  a  lot 
of  numerical  work  can  be  avoided  by  such  an  asymptotic  approach  to  the  solution.  This  is 
so  because,  from  the  numerical  point  of  view,  singularly  perturbed  problems  are  tougher 
than  regularly  perturbed  ones.  In  the  case  of  (8)  vs.  (7),  a  glance  at  Fig.  2  shows  why:  the 
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step-size  has  to  be  very  small,  at  least  in  the  "boundary  layers"  near  the  ends  of  the 
interval. 


Figure  2.  Solution  of  (7)  for  e  small  (k  =  1). 


PROBLEM  (2)  AS  A  PERTUBED  PROBLEM 

For  reference,  let  us  write  weak  formulations  analogous  to  (4)  and  (6),  as  they  stand 
before  any  commitment  to  particular  finite  elements  has  been  made.  Call  A  (resp.  O)  the 
set  of  all  square-integrable  ''sctor  fields  (resp.  functions),  with  a  square  integrable  curl 
(resp.  gradient),  over  the  whole  space.  The  weak  formulations  are,  for  the  (b-oriented) 
"a-method":  find  ae  g  A  such  that 

JE  |Tl  curl  ae .  curl  a'  =  jE  j8  .a'  V  a'  e  A 

(which  can  be  rewritten  as 

(9)  Je-  m  Mo  1  curl  \  ■  curl  a'  +  e  JM  Mi  '  cur*  ae  •  curl  a’  =  4 js  • a'  V  a'  g  A), 
and  for  the  (h-oriented)  9 -method:  find  (p£e  such  that 
1E  p  (grad  <pE  +  hg) .  grad  cp'  =  0  V  9'  g  O 
(which  can  be  rewritten  as 

( 4-  m  Mo  (grad  <pe  +  hs)  •  grad  9’  +  e_l  JM  p,  (grad  9,  +  h8) .  grad  9’ 

=  0  V  9'  g  <I>). 
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Remark,  one  should  compare  (9)  with  the  following: 

J  9xue  9xu'  +  e  /  k  u  u'  =  J  f  u'  V  u'  e  U, 

which  is  equation  (7)  in  weak  formulation.  The  analogy  we  rely  on  is  due  to  the  presence 
of  the  small  parameter  e  in  front  of  the  second  integral  in  both  cases. 

Let  us  now  see  in  which  way  our  problem  (2),  when  formulated  cs  (9)  or  as  (10),  can 
be  considered  a  "perturbation"  of  some  simpler  one.  A  difficulty  is  that  there  are  two 
small  parameters  in  the  picture:  the  above  ratio  £  (~  jio/fJ.),  and  the  ratio  d/L  of  the 
gap- width  to  the  circuit-length.  The  analysis  in  terms  of  two  such  "competing"  small 
parameters  is  quite  interesting;  it  leads — depending  on  which  parameter  dominates — to 
very  different  limit  models  (called  "significant  degenerations"  in  [10]),  which  do  have 
meaningful  interpretations  within  the  present  context.  (See  an  application  to  skin-effect, 
and  to  the  concept  of  "surface  impedance"  [8],  in  [4].)  There  is  no  space  here  to  justify  our 
main  assumption  that  £,  not  d/L,  is  the  dominant  small  parameter  in  the  present  case:  this 
would  call  for  some  lengthy  dimensional  analysis  (whose  principles,  anyway,  are  familiar 
to  all  readers).  We  cannot  either  dwell  on  the  Taylor  expansions  in  terms  of  £,  and  similar 
techniques,  by  which  models  (11)  and  (12)  below  are  derived:  although  these  are  familiar 
and  elementary  mathematical  tools,  their  application  to  the  present  case  requires  a  heavy 
apparatus  of  functional  analysis,  that  seems  out  of  place.  So  we  shall  proceed  more 
dogmatically:  first,  some  notation,  next  a  terse  statement  of  the  results,  then  some  a 
posteriori  justification. 

Let  us  call  M  the  union  of  M  (cf.  Fig.  1)  and  the  gap,  and  Z  a  surface  congruent  to 
the  pole  surfaces,  located  in  the  middle  of  the  gap.  Enlarge  space  O  by  accepting  into  it 
functions  which  are  allowed  to  be  discontinuous  across  Z.  Call  <I>  the  bigger  space  so 
obtained. 

Now,  the  limit  problems.  The  one  in  terms  of  the  vector  potential  a  splits  into  two 
successive  problems:  find  a„€  A  such  that  both  conditions 

I  Je-m  1*0  '  curl  ao  •  curl  a  =  Jh  j®  ■  a  V  a’  e  A, 

(ID  I 

I  curl(|i,_1  curl  a0)  =  0  in  M, 
hold.  The  one  in  tp  is  find  cp0  €  such  that  both  conditions 

1  JE-m  Ho  (grad  <Po  +  h8)  •  grad  9'  =  0  V  (p'  e 
(12)  I 

I  grad  <p0  +  h8  =  0  in  M-Z, 
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hold.  Then  b  =  curl  a*,  and  h  =  grad  <p0  +  h8  are  the  limit  solutions. 

Why,  on  physical  grounds?  Imagine  the  air  gap  is  shrunk  to  2,  while  the  ratio  e  = 
Po/ji,  tends  to  zero.  Then  h  tends  to  zero  in  the  core,  hence  the  second  line  of  (12),  and 
(since  its  circulation  is  held  constant)  tends  to  infinity  in  the  air  gap,  hence  the  eventual 
discontinuity  of  the  magnetic  potential.  As  for  b,  one  has  divb  =  0  and  curl(p0_l  b)  =  j8 
out  of  the  core,  and  its  flux  lines  impinge  orthogonally  to  the  surface  of  the  core  in  the  case 
of  an  infinite  permeability:  indeed,  the  first  line  of  (1 1)  implies  all  that.  The  second  line 
of  (1 1)  describes  what  happens  inside  the  core,  and  (since  the  tangential  values  of  a  are 
now  known,  from  solving  (11),  first  line)  constitutes  a  well-posed  problem.  (All  this, 
though  it  may  not  be  obvious,  relies  on  the  above  observation  that  d/L  is  large  with 
respect  to  e.) 


Remark.  Neither  (1 1)  or  (12)  are  practical  ways  to  solve  the  problem,  if  only  because  of 
the  non-linearity:  p, ,  of  course,  is  not  known  in  advance,  so  no  useful  information  on  the 
field  in  the  core  could  be  obtained  that  way.  (The  stray-field  thus  obtained,  however, 
might  be  reasonably  accurate.)  We  do  not  advocate  the  actual  use  of  limit  models  in  the 
specific  case  of  Pb.  13.  We  just  expect  to  get  insight — about  how  difficult  it  will  be  to 
solve  the  true  problem — from  an  examination  of  the  limit  one.O 

Now  it  should  be  clear  that  the  limit  process  from  (9)  to  ( 11)  is  regular,  while  the  one 
from  (10)  to  (12)  is  singular:  for  in  the  latter,  there  is  this  tendency  of  <pe  to  "escape 
from"  space  <J>  (towards  the  larger  one  <I>),  while  nothing  similar  happens  to  aE,  which 
stays  in  A. 

The  difference  can  be  seen  more  concretely  from  what  happens  in  the  air  at  the 
boundary  of  2:  a  singularity  of  cp  (actually,  a  jump-discontinuity),  whereas  a  is  regular 
(Fig.  3).  Figure  3  is  two-dimensional  for  convenience,  but  what  is  described  here  is  a 
feature  of  the  three-dimensional  situation:  the  magnetic  potential  changes  very  rapidly 
across  the  airgap  (Fig.  3,  right).  So,  if  one  tries  to  model  the  situation  by  treating  the 
airgap  as  a  surface,  across  which  <p  can  be  discontinuous — which  is  precisely  model 
(12) — the  computed  (p  will  look  as  on  the  left  part  of  the  figure,  with  an  obvious 
singularity.  Our  point  is  that,  although  this  limit  model  is  not  the  one  which  is  actually 
solved  for,  the  latter  is  sufficiently  closed  to  it  for  the  singular  character  of  the  limit  case  to 
be  felt,  in  various  ways,  when  actually  solving  for  (p. 

The  nature  of  the  finite  elements  has  not  been  a  factor  in  this  discussion.  So  we  may 
conclude  that  regularity  (resp.  singularity)  pertains  to  the  whole  family  of  a-methods 
(resp.  tp-methods).  Singularly  perturbed  problems  are  notoriously  more  difficult  to  solve 
numerically  than  regularly  perturbed  ones.  Hence  the  tentative  conclusion:  b-oriented 
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methods  as  a  whole  should  be  considered  with  a  favorable  bias,  for  this  particular 
problem.  Of  course,  this  should  not  be  construed  as  a  seal  of  approval  for  any  particular 
b-oriented  methods. 


Figure  3.  Flux  lines  in  the  vertical  symmetry  plane,  near  the  edge  of  the  air  gap, 
showing  the  singularity  of  the  field:  left,  as  predicted  by  the  limit  models,  right,  as  it 
really  is.  The  situation  is  nearly  two-dimensional.  In  that  case,  a  is  a  scalar,  whose 
isolines  are  precisely  the  flux-lines  shown.  Dotted  lines  are  isovalues  of  <p,  the 
magnetic  potential.  Clearly,  a  is  continuous  at  point  P,  but  <p  is  not,  in  the  limit 
model. 


CONCLUSION 

Let  us  summarize  the  main  line  of  our  argumententation.  The  problem  which  is 
actually  solved  (the  one  with  a  small  but  non-zero  e)  is  closed  to  a  singular  limit  when 
h-oriented  methods  are  used,  and  to  a  regular  limit  when  b-oriented  methods  are  used.  So 
in  the  critical  region  (near  the  edges  of  the  poles),  h-oriented  methods  are  likely  to  behave 
in  a  nastier  way  than  b-oriented  ones.  For  instance,  they  may  require  a  finer  mesh  if  the 
same  precision  is  to  be  achieved.  But  meshes,  as  a  rule,  are  not  what  would  be  desirable, 
they  are  what  computing  resources  allow.  So  we  may  expect,  on  the  average,  better 
precision  from  b-oriented  methods  in  reported  numerical  experiments.  This  seems 
indeed  to  be  the  present  trend,  as  regards  this  particular  problem. 

We  have  seen  how  an  analysis  of  the  nature  of  the  perturbation  (regular  or  singular) 
in  a  small  parameter  problem  can  be  used  to  predict  the  relative  success  of  a  particular 
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family  of  numerical  methods  with  respect  to  another.  This  was,  at  best,  heuristics.  Can  the 
reasoning  eventually  be  refined  into  one  that  would  yield  precise,  provable,  statements? 
Perhaps,  but  I  tend  to  think  efforts  in  this  direction  would  be  misdirected.  In  the  present 
state  of  the  art,  it  seems  better  to  view  the  above  explanation  as  a  route  towards  a 
conjecture ,  to  be  confirmed  or  rejected  on  the  basis  of  numerical  experiments:  that,  for 
these  problems  of  non-linear  magnetostatics  with  high  relative  permeability,  "b-oriented 
methods  work  better". 
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